Division of Computer Research and Technology
Projects Ending FY 1992


Z01 CT00013-18 LSM   JE Mosimann    Multivariate Statistical Analysis
    Principal investigator left DCRT

Z01 CT00026-16 PSL   VA Parsegian   Molecular Forces in Cellular Organization and Function
    Split into two new projects

Z01 CT00042-14 LAS   MA Douglas     Image Processing in Electron Microscopy/X-ray/EEL Spectroscopy
    Project completed

Z01 CT00130-07 PSL   BK Lee         Computer Graphics
    Principal investigator left DCRT

Z01 CT00132-08 LSM   JE Mosimann    Consulting Services
    Principal investigator left DCRT

Z01 CT00148-07 CSL   RL Martino     Neuromagnetometer Computer System
    Project completed

Z01 CT00158-06 PSL   BK Lee         Theoretical Study of Protein Stability
    Principal investigator left DCRT

Z01 CT00176-08 PSL   BK Lee         Protein Folding
    Principal investigator left DCRT

Z01 CT00223-01 LAS   JJ Bailey      Clinical and Research Use of Evoked Potentials
    Project transferred to outside collaborators

Z01 CT00225-01 PSL   BK Lee         Computing with an Artificial Neural Network
    Principal investigator left DCRT
NOTICE OF INTRAMURAL RESEARCH PROJECT


PROJECT NUMBER

Z01 CT00001-22 LSM


PERIOD  COVERED 

October  1,  1991  to  September  30,  1992 


TITLE  OF  PROJECT      (80  characters  or  less.   Title  must  fit  on  one  line  between  the  borders.) 

Automated  Data  Processing  of  Medical  Language 


PRINCIPAL  INVESTIGATOR  (List  other  professional  personnel  below  the  Principal  Investigator )    (Name,  title,  laboratory,  and  institute  affiliation) 


PI:       Mr. George Dunham     Computer Systems Analyst     LSM, DCRT, NIH

Others:   E. Jaffe, M.D.        Chief, Hematopathology Section, LP, DCBD, NCI, NIH


COOPERATING  UNITS  (if  any) 


LAB/BRANCH 


Laboratory  of  Statistical  and  Mathematical  Methodology 


SECTION 


Biomathematics  and  Computer  Science  Section 


INSTITUTE  AND  LOCATION   Division  of  Computer  Research  and  Technology,  NIH,  Bethesda,  MD  20892 


TOTAL MAN-YEARS: 1.1


PROFESSIONAL:   1.1 


OTHER:     0.0 


CHECK  APPROPRIATE  BOX(ES) 

[ ] (a) Human subjects
    [ ] (a1) Minors
    [ ] (a2) Interviews

[ ] (b) Human tissues

[x] (c) Neither


SUMMARY OF WORK (Use standard unreduced type. Do not exceed the space provided.)

The  major  objective  of  the  project  is  the  development  of  methods  for  automatic  processing  of  natural 
medical  language.  Crucial  information  of  patient  records  is  embedded  in  natural  language  text  generated 
during  physical  exams  or  in  reports  from  various  hospital  laboratories.  Precise  retrieval  of  subsets  of 
patient  data  via  this  information  is  needed  to  increase  the  scientific  value  of  this  latent  information  pool 
for  retrospective  studies,  including  studies  of  drug  effects,  and  for  teaching  purposes.  The  spreading  of 
this technology beyond a limited domain depends upon developing "intelligent" algorithms able to learn
the  highly  particular  semantic  models  and  language  syntax  governing  the  language  used  in  specific 
micro-domains  of  medical  knowledge. 

Developed  technology  is  utilized  in  the  automatic  encoding  of  Surgical  Pathology  reports  for 
NCI/DCBD/LP,  which  become  a  source  of  problems  and  examples  in  medical  lexicography  and 
semantic  modeling  of  medical  information. 

A  logic-based  software  platform  (Lexicographic  Environment  Software)  is  under  development  to 
facilitate  manipulation  of  acyclic  directed  graph  dictionaries  and  research  in  medical  lexicography,  "high 
resolution"  representation  and  query  of  medical  information,  and  induction  and  evolution  of  syntactic 
and  semantic  rules. 

Experiments  with  artificial  neural  network  solutions  to  problems  arising  in  natural  language  were 
performed. 
DEPARTMENT OF HEALTH AND HUMAN SERVICES - PUBLIC HEALTH SERVICE

NOTICE OF INTRAMURAL RESEARCH PROJECT


PROJECT NUMBER


Z01 CT00002-23 LAS


PERIOD  COVERED 

October  1,  1991  to  September  30,  1992 


TITLE OF PROJECT (80 characters or less. Title must fit on one line between the borders.)

Computer-Aided  Analysis  of  Electrocardiology 


PRINCIPAL INVESTIGATOR (List other professional personnel below the Principal Investigator.) (Name, title, laboratory, and institute affiliation)

PI:       James J. Bailey, M.D.      Section Chief      (DCRT/LAS)

Others:   Erik W. Pottala, Ph.D.     Senior Engineer    (DCRT/LAS)
          Gregory Campbell, Ph.D.    Acting Lab Chief   (DCRT/LSM)
          D. Levy, M.D.              Cardiologist       (Framingham Heart Study)
          J.E. Norman, Ph.D.         Statistician       (NHLBI/FSB)
          D. MacAreavey, M.D.        Cardiologist       (NHLBI/CB)


COOPERATING UNITS (if any)

Framingham  Heart  Study,  Cardiology  Branch,  NHLBI 
Field  Studies  and  Biometry  Branch,  NHLBI 


LAB/BRANCH 

Laboratory  of  Applied  Studies 


SECTION 

Medical  Applications  Section 


INSTITUTE  AND  LOCATION 

DCRT, NIH, Bethesda, MD 20892


TOTAL  STAFF  YEARS: 

0.7 


PROFESSIONAL: 

0.5 


OTHER: 

0.2 


CHECK  APPROPRIATE  BOX(ES) 

[ ] (a) Human subjects     [ ] (b) Human tissues
    [ ] (a1) Minors
    [ ] (a2) Interviews


(c) Neither


SUMMARY OF WORK (Use standard unreduced type. Do not exceed the space provided.)

These  studies  are  directed  toward  evaluation  of  the  prognostic  power  of  the 
electrocardiogram  when  analyzed  by  advanced  computer  methodology  and  the  predictive 
accuracy  of  diagnostic  criteria  when  implemented  in  ECG  computer  programs.   Digital 
signal  processing  of  the  electrocardiogram  is  a  problem  area  requiring  considerable 
engineering  and  computer  science  expertise  as  well  as  knowledge  about  its  clinical 
relevance. The use of well-documented populations and multivariate statistical
techniques in designing new criteria is also under investigation.
Studies  have  been  pursued  in  collaboration  with  NHLBI  and  the  Framingham  Heart 
Study. 

The  Framingham  Heart  Study  has  shown  that  left  ventricular  hypertrophy  (LVH)  is  an 
independent  risk  factor  for  pathological  cardiac  events;  an  important  screening 
modality for LVH is the routine, diagnostic electrocardiogram since more than 100
million  are  performed  annually  in  the  U.S.   Collaborative  studies  show  that  when 
age  and  indices  of  body  habitus,  stratified  by  sex,  are  combined  with  ECG 
parameters,  the  sensitivity  and  specificity  of  LVH  diagnoses  are  significantly 
improved,  which  would  result  in  more  and  earlier  diagnoses. 
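
For readers less familiar with the two measures named above, the following minimal sketch shows how sensitivity and specificity are computed from screening counts. The counts are hypothetical, invented purely for illustration, and are not results from the Framingham Heart Study or from this project.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # fraction of true LVH cases that are flagged
    specificity = tn / (tn + fp)   # fraction of non-LVH cases that are cleared
    return sensitivity, specificity


# Hypothetical counts for 100 LVH and 900 non-LVH subjects (illustration only).
ecg_only = sensitivity_specificity(tp=30, fn=70, tn=855, fp=45)
ecg_plus_covariates = sensitivity_specificity(tp=50, fn=50, tn=873, fp=27)

print("ECG criteria alone:      ", ecg_only)
print("ECG + age, sex, habitus: ", ecg_plus_covariates)
```

In this invented example both measures rise when covariates are added, which is the kind of improvement the summary describes.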

Another important diagnostic modality is the ambulatory electrocardiogram (AECG),
which, by virtue of collecting hours to days of data, has much greater potential both
for evaluating symptoms that may be caused by cardiac arrhythmias or myocardial
ischemia and for detecting alterations in cardiac electrophysiology
that may have prognostic significance. Studies of the methodology for analyzing
AECGs  are  being  pursued  in  collaboration  with  NHLBI. 
NOTICE OF INTRAMURAL RESEARCH PROJECT


PROJECT NUMBER


Z01 CT00003-22 LAS


PERIOD  COVERED 

October 1, 1991 to September 30, 1992


TITLE OF PROJECT (80 characters or less. Title must fit on one line between the borders.)

Computer  Systems  and  Applications  for  Nuclear  Medicine 


PRINCIPAL INVESTIGATOR (List other professional personnel below the Principal Investigator.) (Name, title, laboratory, and institute affiliation)

PI:       Margaret A. Douglas, B.A.       Comp. Sys. Analyst     (DCRT/LAS)

Others:   James J. Bailey, M.D.           Section Chief          (DCRT/LAS)
          Paul Kalkowski, B.S.            Comp. Sys. Analyst     (DCRT/LAS)
          Stephen L. Bacharach, Ph.D.     Med. Physicist/Chief   (CC/NM)
          Michael V. Green, M.S.          Section Chief          (CC/NM)


COOPERATING UNITS (if any)


LAB/BRANCH 

Laboratory  of  Applied  Studies 


SECTION 

Medical  Applications  Section 


INSTITUTE  AND  LOCATION 

DCRT,  NIH,  Bethesda,  MD   20892 


TOTAL  STAFF  YEARS: 

1.8 


PROFESSIONAL: 

1.6 


OTHER: 

0.2 


CHECK APPROPRIATE BOX(ES)

[ ] (a) Human subjects     [ ] (b) Human tissues
    [ ] (a1) Minors
    [ ] (a2) Interviews


(c) Neither


SUMMARY OF WORK (Use standard unreduced type. Do not exceed the space provided.)

LAS  develops  systems  for  computer-based  mathematical  analysis,  pattern  recognition 
and  image  processing  in  support  of  diagnostic  activities  in  the  Nuclear  Medicine 
Department  of  the  Clinical  Center  and  collaborating  Institutes.   Many  applications 
are  directed  toward  the  correlation  of  function  with  structure,  such  as:  estimation 
of  ventricular  function  from  radionuclide  ventriculography  or  PET  scan  (functional 
data) compared to MRI or CT scans (anatomical data). The primary application has
been  in  the  analysis,  registration,  and  segmentation  of  cardiac  image  data.   An 
automated  system  has  been  developed  for  three-dimensional  registration  of  cardiac 
PET  emission  data  for  a  subject  over  repeated  studies  without  use  of  fiducial 
points  or  contours.   Systems  are  being  developed  for  the  creation  of  sequences  of 
projections  of  volumetric  data,  alignment  of  projection  to  tomographic  data  and  for 
application  of  two-dimensional  regions-of-interest  to  translated  and  rotated 
volumetric  data.   Other  problems  involve  projection  of  volumetric  data  to  a  two- 
dimensional  CRT  screen  in  ways  that  are  meaningful  and  useful  to  researchers  and 
the  problem  of  how  best  to  visualize  volumetric  PET  superimposed  on  an  MRI  or  CT 
volume  with  variable  levels  of  transparency.   Computer  platforms  used  in  this 
project  include  the  Vax,  Macintosh,  IBM-compatible  PC,  and  UNIX  workstations. 
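
As an illustration of the transparency problem mentioned above, here is a minimal sketch of per-pixel alpha blending of a functional slice over an anatomical slice. It assumes the two slices are already registered and share the same grid; the function name `blend` and the tiny 2x2 arrays are hypothetical and are not taken from the LAS software.

```python
def blend(pet, mri, alpha):
    """Blend two equally sized 2D images; alpha is the PET opacity in [0, 1]."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [
        [alpha * p + (1.0 - alpha) * m for p, m in zip(pet_row, mri_row)]
        for pet_row, mri_row in zip(pet, mri)
    ]


# Toy, already-registered slices with intensities scaled to [0, 1].
pet_slice = [[0.0, 1.0], [0.5, 0.0]]
mri_slice = [[1.0, 1.0], [0.0, 0.5]]

overlay = blend(pet_slice, mri_slice, alpha=0.25)   # mostly anatomy, some function
print(overlay)
```

Varying `alpha` between 0 and 1 moves the display from pure anatomy to pure function; a real viewer would apply the same idea to every voxel along each projection ray.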

LAS, in collaboration with the Nuclear Medicine Department of the Clinical Center,
has, over the past six years, designed and specified a general-purpose image
processing  system,  MIRAGe.   Programming  was  performed  by  contractors  supervised  by 
LAS  and  the  Nuclear  Medicine  Department.   The  completed  basic  system  has  been 
ported  to  several  other  NIH  computer  systems  including  Vax  workstations  and 
Macintosh  systems.   Many  academic  and  commercial  institutions  across  North  America 
and  Europe  have  requested  and  received  copies  of  the  system.   Work  has  begun  on  the 
next  generation  system  based  on  a  UNIX  workstation.   This  system  will  incorporate 
the  functionality  of  MIRAGe  with  advanced  3D  visualization,  analysis  and 
registration  capabilities. 
NOTICE OF INTRAMURAL RESEARCH PROJECT


PROJECT NUMBER


Z01 CT00004-22 LAS


PERIOD  COVERED 

October  1,  1991  to  September  30,  1992 


TITLE OF PROJECT (80 characters or less. Title must fit on one line between the borders.)

Analysis  of  Physiological  Signals 


PRINCIPAL INVESTIGATOR (List other professional personnel below the Principal Investigator.) (Name, title, laboratory, and institute affiliation)

PI:       Erik W. Pottala, Ph.D.     Senior Engineer        (DCRT/LAS)

Others:   James J. Bailey, M.D.      Section Chief          (DCRT/LAS)
          J. A. Dvorak, Ph.D.        Senior Investigator    (NIAID/LPD)
          K.L. Rasmussen, Ph.D.      Senior Staff Fellow    (NICHD/LCE)


COOPERATING UNITS (if any)

NIAID, NICHD

Medical College of Ohio, Creighton University (R.W. Bowser, B.Sc.),

University of Puerto Rico (E.C. Phoebus, Ph.D.)


LAB/BRANCH 

Laboratory  of  Applied  Studies 


SECTION 

Medical  Application  Section 


INSTITUTE  AND  LOCATION 

DCRT,    NIH,     Bethesda,    MD      20892 


TOTAL  STAFF  YEARS: 

0.9 


PROFESSIONAL: 
0.8 


OTHER: 

0.1 


CHECK APPROPRIATE BOX(ES)

[ ] (a) Human subjects     [ ] (b) Human tissues
    [ ] (a1) Minors
    [ ] (a2) Interviews


(c) Neither


SUMMARY OF WORK (Use standard unreduced type. Do not exceed the space provided.)

This  project  involves  the  development  and  application  of  microcomputer-based  signal 
processing techniques for analysis of physiological signals, e.g.,
electrocardiogram,  electromyogram,  and  electroencephalogram.   The  LAS 
microcomputer-based  systems  provide  a  general  purpose  analog-to-digital  conversion 
facility  and  an  ability  to  filter  the  signals  with  a  variety  of  analog  and  digital 
techniques (before and/or after A/D conversion).

An  important  component  of  this  project  is  the  modeling  of  the  physiological  system 
that  produces  the  signals.   To  serve  this  purpose  LAS  has  acquired  and  is  testing 
several  software  packages  including  SIMULINK,  NEURALNET  TOOLBOX,  and  SYSTEM 
IDENTIFICATION  SOFTWARE  (supplied  by  Mathworks)  and  HISPEC  (supplied  by  United 
Signals  and  Systems,  Inc.),  which  uses  autoregressive  modeling  to  produce  an 
autocorrelation  function  and  Fourier  transform  to  produce  very  high  resolution 
power  spectra.   These  packages  are  being  integrated  into  MATLAB. 
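
The idea of obtaining a power spectrum from an autoregressive model can be sketched in a few lines. The example below is a textbook second-order Yule-Walker fit written in plain Python; it is a stand-in for illustration only and makes no claim about how HISPEC or the MathWorks toolboxes work internally. The model order, the synthetic test signal, and all function names are assumptions.

```python
import cmath
import math


def autocorr(x, max_lag):
    """Biased sample autocorrelation r[0..max_lag] of a zero-mean signal."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) / n
            for k in range(max_lag + 1)]


def fit_ar2(x):
    """Solve the 2x2 Yule-Walker system for x[n] = a1*x[n-1] + a2*x[n-2] + e[n]."""
    r0, r1, r2 = autocorr(x, 2)
    det = r0 * r0 - r1 * r1
    a1 = r1 * (r0 - r2) / det
    a2 = (r0 * r2 - r1 * r1) / det
    noise_var = r0 - a1 * r1 - a2 * r2      # variance of the driving noise e[n]
    return a1, a2, noise_var


def ar2_spectrum(a1, a2, noise_var, freq):
    """AR(2) power spectral density at freq, given in cycles per sample."""
    z = cmath.exp(-2j * math.pi * freq)
    return noise_var / abs(1.0 - a1 * z - a2 * z * z) ** 2


# Synthetic test input (an assumption): a sinusoid at 0.1 cycles per sample.
signal = [math.cos(2.0 * math.pi * 0.1 * n) for n in range(500)]
a1, a2, noise_var = fit_ar2(signal)

# Locate the spectral peak on a grid of frequencies below the Nyquist limit.
freqs = [k / 1000.0 for k in range(1, 500)]
peak_freq = max(freqs, key=lambda f: ar2_spectrum(a1, a2, noise_var, f))
print("estimated peak frequency:", peak_freq)
```

Because the spectrum is evaluated from the fitted coefficients rather than from a finite Fourier sum of the data, it can be sampled on an arbitrarily fine frequency grid, which is the source of the high resolution mentioned above.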

If  these  software  packages  prove  themselves  in  tests  with  real  and/or  simulated 
physiological signal data, they will be implemented on Macintosh platforms. A main
objective of this project is a methodology for guaranteeing the fidelity of
physiological signals, which can be critically important to diagnostic
interpretation (e.g., in electrocardiology). A further objective is to use advanced
mathematical  and  modeling  techniques  to  separate  pathophysiology  from  normal. 
Z01 CT00008-19 LSM


PERIOD COVERED

October 1, 1991 to September 30, 1992


TITLE OF PROJECT (80 characters or less. Title must fit on one line between the borders.)

DNAdraw  for  the  Macintosh/Computer  Software  for  DNA  Sequence  Display  and  Analysis 


PRINCIPAL  INVESTIGATOR  (List  other  professional  personnel  below  the  Principal  Investigator.)    (Name,  title,  laboratory,  and  institute  affiliation) 


PI:       Mr. Marvin Shapiro     Research Math. Statistician     LSM, DCRT, NIH


Others: 


COOPERATING  UNITS  (if  any) 


LAB/BRANCH 


Laboratory  of  Statistical  and  Mathematical  Methodology 


SECTION 


Statistical  Methodology  Section 


INSTITUTE AND LOCATION   Division of Computer Research and Technology, NIH, Bethesda, MD 20892


TOTAL MAN-YEARS: 0.4


PROFESSIONAL:  0.4 


OTHER:     0.0 


CHECK  APPROPRIATE  BOX(ES) 

[ ] (a) Human subjects
    [ ] (a1) Minors
    [ ] (a2) Interviews

[ ] (b) Human tissues

[x] (c) Neither


SUMMARY OF WORK (Use standard unreduced type. Do not exceed the space provided.)

A  previously  developed,  in-house  DOS  program  for  drawing  DNA  sequences  has  received  considerable 
use  both  within  the  NIH  community  and  more  widely.  However,  based  on  its  inadequacies  and 
numerous  requests  for  a  version  running  on  the  Macintosh,  work  has  begun  preparing  a  completely  new 
version  of  DNAdraw  for  the  Macintosh.  It  will  do  essentially  the  same  job,  i.e.,  formatting  sequence 
data and drawing highlighted sequences for publication, but it will have a number of significant
improvements  over  the  PC  version.  First,  being  on  the  Macintosh  and  conforming  to  the  standard 
Macintosh  principles,  it  will  be  immediately  usable,  with  little  or  no  reference  to  a  manual  required.  The 
Macintosh  system  of  menus  will  make  the  specification  and  drawing  of  highlights  extremely  simple  for 
the  user.  In  addition  it  will  use  the  capabilities  of  the  mouse  to  make  interaction  with  the  program  much 
easier  than  the  PC  version. 

There  is  great  emphasis  now  on  finding  sequences  homologous  to  a  given  sequence  and  producing  an 
alignment of the group. A new feature of the DNAdraw program allows for automatic highlighting to
indicate  which  parts  of  such  sequences  are  homologous.  This  is  done  instantaneously  after  a  short 
dialog  with  the  user. 

Since  most  sequencing  laboratories  at  NIH  are  now  using  the  Macintosh  for  their  computer  support,  this 
new  version  of  DNAdraw  should  get  wide  use. 
DEPARTMENT OF HEALTH AND HUMAN SERVICES - PUBLIC HEALTH SERVICE

NOTICE OF INTRAMURAL RESEARCH PROJECT


PROJECT NUMBER


Z01 CT00010-18 LAS


PERIOD  COVERED 

October  1,  1991  to  September  30,  1992 


TITLE OF PROJECT (80 characters or less. Title must fit on one line between the borders.)

Mathematical  and  Computational  Methods  for  Solving  Nonlinear  Equations 


PRINCIPAL INVESTIGATOR (List other professional personnel below the Principal Investigator.) (Name, title, laboratory, and institute affiliation)


PI:       Richard I. Shrager, M.S.     Research Mathematician     (DCRT/LAS)

Others:   P. McPhie, Ph.D.             Biophysicist               (NIDDK/LMB)
          R. Berger, Ph.D.             Section Chief              (NHLBI/LC)
          R. Handler, Ph.D.            Section Chief              (NHLBI/LC)
          A. Alayash, Ph.D.            Research Associate         (CBER)
          J. Fletcher, Ph.D.           Acting Lab Chief           (DCRT/LAS)
          J. Bailey, M.D.              Section Chief              (DCRT/LAS)
          M. Baseler, Ph.D.            Head                       (PRT/FCRC)


COOPERATING UNITS (if any)

LAIR Blood Res. (R. Winslow, M.D., K. Vandegriff, Ph.D., and V. McDonald, M.D.)
Duke Univ. Marine Biomed. Ctr. (C. Bonaventura, Ph.D.)
Hungarian Acad. Sci. (Z. Dancshazy, Ph.D.)


LAB/BRANCH 

Laboratory  of  Applied  Studies 


SECTION 

Mathematical  Analysis 


INSTITUTE  AND  LOCATION 

DCRT,    NIH,    Bethesda,    MD      20892 


TOTAL  STAFF  YEARS: 

0.9 


PROFESSIONAL: 

0.7 


OTHER: 

0.2 


CHECK  APPROPRIATE  BOX(ES) 

[ ] (a) Human subjects
    [ ] (a1) Minors
    [ ] (a2) Interviews


[ ] (b) Human tissues       [x] (c) Neither


SUMMARY OF WORK (Use standard unreduced type. Do not exceed the space provided.)

Laboratory investigators typically fit models to experimentally obtained data with
least squares methods, quadratic programming, simplex methods, or derivative-free
methods (Nelder-Mead). Functional models are
usually resolved by such methods. Other data, such as complex combinations of
known or reference spectra, are more readily resolved by matrix methods such as
Singular Value Decomposition. In such cases, the "singular values" signal the
relative importance of the component spectra in the decomposition of the complex
measured spectra. Such methods can be developed from mathematical software
packages, such as MATLAB, for a number of computing platforms. As a result, such
methods are portable to any machine for which MATLAB has been developed.
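
To make the role of the singular values concrete, the sketch below computes them for a data matrix with only two measured spectra (two columns), where they follow in closed form from the 2x2 Gram matrix. This restriction is an assumption made to keep the example dependency-free; a real analysis would call a library SVD routine on the full matrix. The reference spectra are invented numbers.

```python
import math


def singular_values_two_columns(col1, col2):
    """Singular values of the matrix [col1 col2], largest first."""
    g11 = sum(v * v for v in col1)
    g22 = sum(v * v for v in col2)
    g12 = sum(u * v for u, v in zip(col1, col2))
    # Eigenvalues of the symmetric Gram matrix [[g11, g12], [g12, g22]].
    mean = 0.5 * (g11 + g22)
    radius = math.sqrt((0.5 * (g11 - g22)) ** 2 + g12 * g12)
    return math.sqrt(mean + radius), math.sqrt(max(mean - radius, 0.0))


# Invented reference spectra sampled at five wavelengths.
ref_a = [1.0, 2.0, 3.0, 2.0, 1.0]
ref_b = [0.0, 1.0, 0.0, 1.0, 0.0]

# Two measurements of a single component: the second singular value vanishes.
one_component = singular_values_two_columns(ref_a, [2.0 * v for v in ref_a])

# Two measurements mixing both components: both singular values are significant.
two_components = singular_values_two_columns(
    [a + b for a, b in zip(ref_a, ref_b)],
    [a - b for a, b in zip(ref_a, ref_b)],
)

print("one component: ", one_component)
print("two components:", two_components)
```

The number of singular values that stand clearly above the noise level indicates how many independent component spectra are needed to describe the measurements.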

The purpose of this project is to provide NIH investigators with mathematical tools
for  insight,  analysis,  and  solution  of  complex  equations  that  arise  in  the  modeling 
of  biological  systems.   To  facilitate  these  efforts,  LAS  developed  mathematical 
methods  that  are  accessible  to  investigators  from  many  disciplines.   Software 
packages  that  result  from  these  developments  are  made  available  to  the  research 
community  as  general  research  tools.   Advice  on  the  use  of  certain  commercial 
mathematical  software  packages  is  also  offered. 

This  project  is  currently  involved  with  applications  in  several  diverse  areas, 
e.g., deconvolution of bilirubin-phenobarbital and heme biosynthesis and clearance;
circular dichroism spectra modeling in thermal unfolding of swine pepsinogen;
resolution  of  forward  rate  binding  measurements  in  hemoglobin;  regression  analysis 
of  oxygenation  isotherms;  rapid  scanning  spectrophotometry  for  oxygen  binding  to 
hemoglobin;  bacteriorhodopsin  recovery  from  laser  flash;  and  cytochrome-a-a3 
kinetics. 
A manuscript was submitted describing a general criterion and two applications of it. The criterion
allows  one  to  distinguish  between  rings  with  the  same  "external  action."  Rings  are  algebraic  structures 
with  addition  and  multiplication  satisfying  certain  rules.  For  example,  the  ordinary  decimal  numbers 
form  a  ring,  as  do  the  fractions,  as  do  the  integers,  as  do  the  systems  of  integer  remainders  obtained 
when  dividing  by  a  fixed  positive  integer.  There  are  also  rings  constructed  in  more  complicated  ways, 
by taking matrices (arrays of numbers) for example. The study of rings is a major part of modern
algebra.  The  "external  actions"  of  rings  distinguished  by  our  new  criterion  are  difficult  to  describe  in 
nontechnical  terms.  Essentially,  the  algebraic  properties  of  a  ring  can  be  viewed  "from  the  outside" 
using  structures  that  arise  from  the  ring.  We  say  that  two  rings  have  the  same  external  action  if  the  two 
rings  lead  to  the  same  "outside"  view.  Rings  with  the  same  outside  view  must  have  the  same 
"characteristic"  which  is  zero  for  such  rings  as  the  decimal  numbers  or  fractions,  but  is  the  integer  d  for 
the  ring  of  integer  remainders  after  division  by  d. 

All rings with the same characteristic are known to have the same external action if the characteristic is a
prime number or a product of distinct primes. An application of the general criterion shows that there
are infinitely many different external actions corresponding to each other possible ring characteristic.
Another application shows that a ring need not have the same external action as its opposite or dual ring,
obtained  by  reversing  the  factors  in  multiplications. 
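
Two of the notions used above lend themselves to a short computational illustration: the characteristic of the ring of integer remainders after division by d, and the test for whether a characteristic is a prime or a product of distinct primes. The sketch below is only a numerical aid; the function names are hypothetical, and it says nothing about the criterion itself.

```python
def characteristic_of_remainder_ring(d):
    """Characteristic of the integers modulo d: least n > 0 with n*1 == 0 (mod d)."""
    if d < 2:
        raise ValueError("d must be at least 2")
    n, total = 1, 1 % d
    while total != 0:          # keep adding the unit element until the sum is zero
        n += 1
        total = (total + 1) % d
    return n


def is_product_of_distinct_primes(n):
    """True if n >= 2 is a prime or a product of distinct primes."""
    if n < 2:
        return False
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:     # p divides the original number at least twice
                return False
        p += 1
    return True


for d in (6, 7, 12, 30):
    c = characteristic_of_remainder_ring(d)
    print(d, c, is_product_of_distinct_primes(c))
```

By the statement in the text, characteristics such as 6, 7, and 30 fall in the case where all rings share one external action, while 12 (divisible by the square of 2) belongs to the case with infinitely many.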

The  findings  of  this  study  are  theorems  of  pure  mathematics.  The  main  purpose  of  such  studies  is  to 
better  understand  the  mathematical  tools  that  have  been  successfully  applied  to  many  problems  of 
science  and  engineering.  There  are  numerous  historical  examples  of  advances  in  science  and 
technology,  often  unpredictable,  resulting  from  such  improvements  in  theoretical  understanding. 
This project focuses on the basic accuracy of NMR measurements and the design of experiments for
such  measurements  to  minimize  errors  due  to  instrumental  noise.  We  have  made  experimental 
measurements  on  a  number  of  spectrometers  to  verify  the  absence  of  significant  correlations  in  the  noise 
which  would  necessitate  changes  in  the  analysis.  A  natural  extension  of  our  earlier  work  on  optimal 
design  of  NMR  experiments  is  to  adaptive  designs,  to  reduce  instrumental  running  times  and  improve 
the precision of estimates of spin-lattice relaxation times. This parameter finds wide application in
chemistry and medical imaging. We have developed a set of partially adaptive optimal designs which
allow for two stages in estimating T1, the design of the second being based on results from the first.
This  is  relatively  simple  to  implement  and  generally  reduces  the  running  time  and  increases  precision. 
An  additional  extension  of  the  theory  is  that  of  designing  optimal  experiments  for  the  measurement  of 
rate  constants  for  in-vivo  experiments.  This  work  is  still  in  progress. 
Various investigations were undertaken to establish the foundations of optically-based biomedical
measurement techniques. Thus, we worked on mathematical models to link the fundamental optical
properties of biological tissue to measurable quantities, such as the diffuse surface reflectance or
time-resolved transmittance, needed to develop noninvasive therapeutic and diagnostic applications of light in
medicine.  Particular  attention  was  given  to  understanding  how  light  is  transmitted  within  materials 
when  optical  heterogeneity  is  important,  for  example  in  media  which  contain  statistically  disordered 
internal  boundaries  (e.g.,  lung,  bone  tissue  microvasculature).  We  showed  how  the  intensities  and 
pathlengths  of  photons  reemitted  at  an  illuminated  surface  depend  on  the  fractal  dimensions  of  scattering 
inclusions  and  other  internal  structures.  We  also  examined  whether  light  might  be  used  to  detect  small 
absorptive  inclusions  hidden  in  a  multiply  scattering  optical  medium.  Such  studies  were  undertaken  to 
support  development  of  technologies  which  have  as  their  goal  the  noninvasive  detection  of  tumors  or 
other  optically  distinguishable  targets.  Numerical  methods  were  devised  to  facilitate  computer  analysis 
of  schemes  that  utilize  diffusely  reflected  or  transmitted  light  to  locate  a  hidden  object.  Computer 
simulations  also  were  performed  to  understand  how  the  probabilistic  nature  of  photon  migration  affects 
laboratory  measurements  of  reemitted  light  intensities. 
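
The probabilistic character of photon migration can be illustrated with a toy simulation. The sketch below uses a one-dimensional lattice random walk with a fixed absorption probability per step; this simplified model, the parameter values, and the function name are assumptions chosen for brevity and do not reproduce the simulations performed in this project.

```python
import random


def simulate_photons(n_photons, absorb_prob, max_steps=10_000, seed=0):
    """Random-walk photons entering at depth 1; depth 0 is the illuminated surface.

    Returns the path lengths (in steps) of re-emitted photons and the number absorbed.
    """
    rng = random.Random(seed)
    path_lengths = []
    absorbed = 0
    for _ in range(n_photons):
        depth, steps = 1, 0
        while depth > 0 and steps < max_steps:
            if rng.random() < absorb_prob:
                absorbed += 1
                break
            depth += 1 if rng.random() < 0.5 else -1   # scatter deeper or shallower
            steps += 1
        else:
            if depth == 0:                 # walked back out through the surface
                path_lengths.append(steps)
    return path_lengths, absorbed


paths, n_absorbed = simulate_photons(n_photons=2000, absorb_prob=0.01)
print("re-emitted:", len(paths), "absorbed:", n_absorbed)
if paths:
    print("mean path length of re-emitted photons:", sum(paths) / len(paths))
```

Repeating the run with different seeds shows how much the re-emitted intensity and the mean path length fluctuate from one finite sample of photons to the next.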

We  also  were  involved  in  several  projects  that  use  scattering  and  other  measurement  techniques  to 
examine  relationships  between  the  physical  properties  and  molecular  structure  of  biological  and  chemical 
gels.  Several  of  the  latter  are  important  in  various  areas  of  biotechnology,  and  protein  gels  and  other 
extended polymer matrices play significant roles in many cell biological processes (e.g., cell motility and
wound healing). In collaboration with other researchers, we used rheological techniques to characterize
the intermolecular bonds of proteoglycan cellular matrix material. We also performed small-angle
neutron scattering (SANS) measurements to examine interactions occurring between polymer strands in
agarose  gels.  These  studies  were  undertaken  to  extend  our  general  knowledge  of  the  behaviors  of  this 
important  class  of  biological  materials,  as  well  as  to  characterize  the  particular  samples  chosen  for  study. 
This project encompasses a number of topics in physics and applied mathematics related to chemical
reaction rates, the qualitative behavior of dynamical systems, blood flow in capillaries and the
development of theory related to problems in optical imaging in turbid media. The use of lasers for
diagnostic purposes is based on such a theoretical underpinning. We have provided a theoretical
explanation of phenomena observed in measurements of plaque buildup in human tissues. A study of
the chemical reaction A+B->C in a one-dimensional geometry has been expanded by deriving more
detailed solutions to the underlying diffusion-reaction equations. By carrying out this generalization we
have demonstrated the theoretical possibility of having a reaction front that moves non-monotonically as
a  function  of  time.  Concurrent  experiments  by  Professor  R.  Kopelman  at  the  University  of  Michigan 
have  verified  that  this  indeed  occurs  in  real  chemical  systems. 
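
A numerical picture of the A+B->C system may help. The sketch below integrates the standard diffusion-reaction equations, da/dt = D_a a_xx - k a b and db/dt = D_b b_xx - k a b, with an explicit finite-difference scheme and reports where the reaction rate k a b is largest. The grid, the parameter values, and the initially separated reactants are assumptions; the analytical solutions derived in this project are not reproduced here.

```python
def simulate_front(n_cells=50, steps=200, d_a=1.0, d_b=1.0, rate=1.0,
                   dx=1.0, dt=0.1):
    """Explicit finite differences for A+B->C with no-flux boundaries."""
    half = n_cells // 2
    a = [1.0] * half + [0.0] * (n_cells - half)   # A initially fills the left half
    b = [0.0] * half + [1.0] * (n_cells - half)   # B initially fills the right half

    def laplacian(u, i):
        left = u[i - 1] if i > 0 else u[i]              # mirror value at the wall
        right = u[i + 1] if i < n_cells - 1 else u[i]
        return (left - 2.0 * u[i] + right) / (dx * dx)

    for _ in range(steps):
        react = [rate * a[i] * b[i] for i in range(n_cells)]
        new_a = [a[i] + dt * (d_a * laplacian(a, i) - react[i])
                 for i in range(n_cells)]
        new_b = [b[i] + dt * (d_b * laplacian(b, i) - react[i])
                 for i in range(n_cells)]
        a, b = new_a, new_b

    # Take the front to be the cell where product C is currently formed fastest.
    front = max(range(n_cells), key=lambda i: rate * a[i] * b[i])
    return a, b, front


a_final, b_final, front_cell = simulate_front()
print("front located at cell", front_cell)
```

With equal diffusion coefficients and equal initial concentrations the front stays at the initial interface by symmetry. Recording `front_cell` over time for unequal values of `d_a` and `d_b` or unequal initial concentrations is one way to explore how the front moves.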

A third project dealt with the extension of present indicator-dilution models, which
specifically model rate processes as being of first order. These are widely used in the interpretation of
physiological  experiments  on  the  exchange  of  molecules  between  tissue  and  blood  vessels.  Current 
theories  are  based  on  very  specific  models  which  have  not  been  checked  experimentally.  We  have 
shown  how  to  derive  a  non-Markovian  version  of  the  theory,  which  permits  the  description  of 
qualitatively  different  behavior  than  that  following  from  the  standard  models.  These  also  form  a  more 
general  framework  for  interpreting  experimental  data. 
This project develops new methods in statistics, both theoretical and applied, using methods of
advanced  algebra.  Results  have  been  obtained  in  the  systematization  of  the  general  linear  mixed  model 
and  in  the  analysis  of  data  having  a  structured  pattern  of  correlation. 

For  biomedical  data  using  repeated  measurements  on  the  same  case,  it  is  often  found  that  one  or  more 
data points are missing or were not obtained. Classical methods for analyzing such data require that
such  cases  (e.g.,  subjects)  be  completely  dropped  from  the  analysis,  despite  the  usually  large  amount  of 
data  that  had  been  obtained  on  the  same  case.  In  order  to  satisfy  the  standard  mathematical  and  statistical 
conditions  for  the  analysis,  such  deletions  often  require  that  half  or  more  of  all  cases  be  deleted.  This  is 
an  inefficient  use  of  biomedical  data  that  is  often  difficult  and  costly  to  obtain,  and  using  just  the  reduced 
data  that  was  collected  can  lead  to  spurious  findings. 

On  the  other  hand,  the  Expectation-Maximization  algorithm  of  Dempster,  Laird,  and  Rubin  [1977] 
has  been  in  use  for  some  time  as  a  broadly  successful  antidote  to  this  problem  of  missing  data.  The 
basic,  iterative  algorithm  is  well-known,  but  is  also  well-known  to  have  convergence  problems  that  are 
hard  to  diagnose  and  get  around. 
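
For orientation, the basic iterative algorithm can be sketched for the simplest case: pairs (x, y) from a bivariate normal distribution in which some y values are missing. This is the textbook EM iteration, not the Jordan-algebra method developed in this project, and the data values are invented for illustration.

```python
def em_bivariate(xs, ys, n_iter=100):
    """EM estimates of mean and covariance when some ys are None (x fully observed)."""
    n = len(xs)
    mu_x = sum(xs) / n
    s_xx = sum((x - mu_x) ** 2 for x in xs) / n

    # Initialize the y-related parameters from the complete cases only.
    obs = [(x, y) for x, y in zip(xs, ys) if y is not None]
    mu_y = sum(y for _, y in obs) / len(obs)
    s_xy = sum((x - mu_x) * (y - mu_y) for x, y in obs) / len(obs)
    s_yy = sum((y - mu_y) ** 2 for _, y in obs) / len(obs)

    for _ in range(n_iter):
        beta = s_xy / s_xx                  # regression slope of y on x
        resid_var = s_yy - beta * s_xy      # conditional variance of y given x
        sum_y = sum_xy = sum_yy = 0.0
        for x, y in zip(xs, ys):
            if y is None:                   # E-step: expected y and y^2 given x
                ey = mu_y + beta * (x - mu_x)
                eyy = ey * ey + resid_var
            else:
                ey, eyy = y, y * y
            sum_y += ey
            sum_xy += x * ey
            sum_yy += eyy
        mu_y = sum_y / n                    # M-step: update from completed statistics
        s_xy = sum_xy / n - mu_x * mu_y
        s_yy = sum_yy / n - mu_y * mu_y
    return mu_x, mu_y, s_xx, s_xy, s_yy


# Invented repeated-measurement data; None marks a missing second measurement.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [2.1, 3.9, None, 8.2, None, 12.1, 13.8, None]
estimate = em_bivariate(xs, ys)
print("estimated mean of y:", estimate[1])
```

No case is dropped: the three subjects with a missing second measurement still contribute their first measurement to every estimate.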

Using  an  idea  first  proposed  by  Rubin  and  Szatrowski  [1982],  we  give  a  complete  solution  to  this 
the problem above using methods of advanced algebra (technically: Jordan algebras). And now some other
well-known  statistical  methods  are  shown  to  work  precisely  because  of  an  implicit  use  of  Jordan 
algebras,  and  so  are  special  cases  of  our  results. 

Our  algorithm  finds  estimates  for  the  total  variation  in  an  experiment,  even  when  this  variation  is 
known  to  be  constrained  by  any  set  of  linear  restrictions.  Combined  with  rigorous,  large-sample 
statistical  approximations,  the  researchers  can  more  systematically  probe  for  effects  in  measurements 
taken over time (e.g., true variation vs. "noise") without having to delete cases.

Thus,  in  the  context  of  biomedical  data  (frequently  having  many  missing  data  points),  the  new 
methods  apply  to  growth  curve  models,  variance  components  analysis,  genetic  linkage 
analysis,  time  series  data,  and  to  longitudinal  data  that  is  often  acquired  in  clinical  trials,  or  in 
epidemiological   case-control   studies. 

A research monograph, Statistical Applications of Jordan Algebras, was anonymously
peer-reviewed and is in press at Springer-Verlag, Publishers.

This project consists of the development of numerical methods and mathematical
software for the solution of ordinary and partial differential equations that
describe dynamic physiological processes. Many biological processes can be
described by systems of ordinary or partial differential equations. Most, but not
all, of these systems are nonlinear and often include multiple time scales, i.e.,
there is a "fast" phase in time, perhaps an intermediate phase, and a "slow" or
longer phase that together show the complete behavioral cycle of the process. Such systems
are  not  easily  or  casually  treatable  by  "standard"  numerical  methods.   This  project 
is  concerned  with  developing  or  adopting  numerical  solution  methods  that  can  apply 
to  a  wide  class  of  such  models  and  equations. 
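
The difficulty with multiple time scales can be seen on the linear test equation y' = -lam * y. The sketch below is a textbook illustration, not the project's software; the function names are hypothetical. It compares an explicit (forward Euler) step with an implicit (backward Euler) step at the same step size.

```python
def forward_euler(lam, y0, h, steps):
    """Explicit Euler for y' = -lam * y."""
    y = y0
    for _ in range(steps):
        y = y + h * (-lam * y)
    return y


def backward_euler(lam, y0, h, steps):
    """Implicit Euler for y' = -lam * y."""
    y = y0
    for _ in range(steps):
        # Solves y_new = y + h * (-lam * y_new) for y_new.
        y = y / (1.0 + h * lam)
    return y


# A "fast" component (lam = 1000) and a "slow" one (lam = 1), same step size.
h = 0.01
fast_explicit = forward_euler(1000.0, 1.0, h, 20)   # grows without bound
fast_implicit = backward_euler(1000.0, 1.0, h, 20)  # decays, as the true solution does
slow_explicit = forward_euler(1.0, 1.0, h, 100)     # close to exp(-1)
slow_implicit = backward_euler(1.0, 1.0, h, 100)    # close to exp(-1)
```

A step size that is adequate for the slow phase makes the explicit method unstable on the fast phase, which is one reason such systems are not easily treated by "standard" methods.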

In FY92, versions of the PDEPGMS were converted to the "C" language using the Bell
Labs FORTOC system. Efforts to debug and recompile versions of the software are in
progress. Preliminary studies explored the adaptation of these methods to
moving boundary problems, with an application to acrosomal growth in cells. Further
work  in  this  area  will  concentrate  on  numerical  methods  to  solve  a  moving  boundary 
problem  with  an  iterative  refinement  scheme. 

The purpose of using molecular graphics, computer modeling, and sequence analysis is to gain insight into
macromolecular or biological structures. Using molecular graphics, scientists can computationally construct models which
may be useful in deciding between two or more alternative interpretations of biochemical or structural data. Computer
modeling is often important in understanding biophysics or other biochemical relationships and how these relate to
biological structures. Sequence analysis uses the one-dimensional amino acid sequence of proteins together with both
Fourier analysis and other predictive algorithms to attempt to identify parts of the sequence which may have a regular
structure.

These interrelated computational methods are used to extrapolate known structural information to predict useful
three-dimensional relationships. Often, three-dimensional structural information is unavailable or experimentally
intractable. Two studies currently in progress include collaborations with PSL, DCRT to study computer models of
biological or biomedical systems, and a collaboration with LSBR, NIAMS to predict the structure of macromolecules.

Progress this year has included studies involving computer models of biopolymers, which have yielded information on the
migration of photons in nonuniform media, and reaction-related diffusion phenomena, resulting in two publications. The
first study has been extended to evaluate the detection of inclusions hidden in tissue using photons, and has been submitted
for publication.

This project uses image processing techniques to analyze electron micrographs. In order to answer important questions in
structural biology, it is necessary to obtain relatively high resolution two- and three-dimensional structural information
about biological macromolecules. While atomic or near atomic resolution information has traditionally been available by
x-ray crystallography for some small molecules and small proteins, the overwhelming majority of biological
macromolecules are not crystalline, or are too large and therefore not amenable to 3-D crystallography.

Biological specimens can, on the other hand, be visualized in the electron microscope using a number of specimen
preparation techniques. Negative staining and shadowing, which both use heavy metals, are two traditional approaches to
increasing contrast to show the biological macromolecule's structure. Cryo-electron microscopy, a newer technique,
attempts to preserve "native" structure by surrounding the specimen with a layer of ice. Collaborative studies with LSBR,
NIAMS are currently underway on a number of such projects, whereby the electron micrograph images are computationally
corrected, combined, averaged, reconstructed, or in some way computationally enhanced to improve the signal-to-noise ratio
or to increase the interpretability of the structures being visualized. Cryo images are typically lower contrast and require
greater computer processing to achieve satisfactory results.

Of  particular  interest  to  our  research  is  the  understanding  of  viral  structures.  At  present  we  are  continuing  our  efforts  to 
investigate the structure of a large animal virus, human herpes simplex virus (type 1). We are in the process of determining
the location of the major capsid proteins. Using the three-dimensional icosahedral reconstruction technique, we apply the
symmetry of these virus particles both to find the orientation of randomly oriented capsid particles (in ice) and to combine
many particles into a three-dimensional reconstruction. Biological material for these herpes reconstructions is provided
through collaboration with researchers at the University of Virginia, Charlottesville. The electron microscopy is performed
in  LSBR,  NIAMS.  Interpretation  of  our  3-D  reconstructions  is  performed  jointly  by  all  collaborators. 

A comprehensive study of the theory and application to biomedical research of Receiver Operating
Characteristic  (ROC)  curves  has  continued.  ROC  analysis  is  used  to  compare  two  diagnostic  or 
laboratory  tests  when  the  data  are  ordinal  categories  or  continuous  variables. 
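
As a reminder of what an ROC curve measures, the sketch below builds the empirical curve for one test from continuous scores and computes the area under it by the trapezoidal rule. It is a generic illustration under stated assumptions, not the methodology of this project: labels are 1 (condition present) or 0 (absent), higher scores indicate the condition, and the function names are hypothetical.

```python
def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) points, from (0, 0) to (1, 1)."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]
    # Each distinct score is used once as the decision threshold, strictest first.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, l in zip(scores, labels) if s >= threshold and l == 1)
        fp = sum(1 for s, l in zip(scores, labels) if s >= threshold and l == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def area_under_curve(points):
    """Trapezoidal area under a list of (x, y) points ordered by x."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4]
labels = [1, 1, 0, 1, 0, 0]
auc = area_under_curve(roc_points(scores, labels))
```

Two tests applied to the same cases can then be compared through their curves or through summary measures such as this area.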

A study of the applicability of the Lomax distribution to ROC curves has continued. A paper with
Dr.  Ratnaparkhi  on  the  maximum  likelihood  estimation  of  Lomax  models  and  their  iterative  fit  via 
computer  is  under  revision. 

An  important  practical  area  of  ROC  analysis  recently  studied  is  the  incorporation  of  covariate 
information. A joint collaboration with Drs. J. Norman, D. Levy and J. Bailey has used age and body
mass  index  (habitus)  in  a  linear  regression  model  to  improve  fuzzy  ROC  performance  of  an  ECG  test  in 
prediction  of  left  ventricular  hypertrophy. 

The  study  of  the  relationship  between  ROC  curves  and  artificial  neural  networks  has  commenced.  ROC 
curves  are  being  studied  in  the  evaluation  of  the  performance  of  individual  artificial  neural  networks 
(ANN).  Also,  a  study  has  begun  to  use  some  measure  based  on  ROC  curves  such  as  area  or  some 
measure  of  tradeoff  of  different  errors  and  risks  to  optimize  the  performance  of  the  ANN. 

A  manuscript  with  M.  Zweig  (CC)  is  in  preparation  concerning  the  fundamental  role  of  ROC  plots  and 
analyses in the evaluation of laboratory tests in clinical chemistry and pathology.

A  study  has  begun  of  randomization  tests,  transformations,  and  statistical  inference. 

A  special  topics  session  on  methodology  for  the  evaluation  of  diagnostic  and  laboratory  tests  and  its 
applicability to biomedicine has been organized for the National Annual Statistical Meetings in Boston
in August, 1992. Also, work has begun on a special oral and written presentation for the NIH
Biostatistics  Symposium  to  be  held  in  January,  1993. 

SUMMARY OF WORK (Use standard unreduced type. Do not exceed the space provided.)

The Computer Systems Laboratory, in collaboration with the Clinical Branch, NEI, has developed ophthalmic image
acquisition and processing systems based on the Apple Macintosh computer and extensions to IMAGE, a powerful general
purpose image acquisition and analysis program developed by NIMH.

As reported previously, CSL has completed a system to quantitate lens opacities (cataracts) from images produced by the
Scheimpflug Slitlamp Camera (SLC), which produces an image of the eye along the optical axis. These images are used
to visualize pathological changes in the ocular media, particularly the lens. The principal research goal is to accurately
measure  changes  in  cataract  density  and  thereby  assess  the  efficacy  of  anti-cataract  drugs.  CSL-written  extensions  to  the 
IMAGE  software  are  used  to  check  exposure  levels,  automatically  locate  lens  boundaries  and  compute  densities  for  three 
regions  of  interest  within  the  lens  structure. 

NEI investigators also visualize cataracts with a retroillumination system, projecting light onto the retina, which reflects
the light back through the lens. The frontal plane images are captured with video instrumentation. CSL is evaluating
software for the removal of an optical distortion pattern, which must be eliminated before morphological or densitometric
measurements can be made on the images.

In FY93 CSL also will be evaluating video image capture of the corneal endothelial cells via specular imaging, along with
possible routines for shape analysis of the corneal cells. Additional support will be offered for iris transillumination defect
studies and for 2D gel studies. The iris defect studies may utilize infrared imaging to reduce the light irritation for the
patient. Adjustments to area measurements on the diseased iris must be made. CSL will also assist in the evaluation of
software to properly analyze 2D coomassie blue gels containing lens proteins.

The  superposition  and  registration  of  differing  tomographic  views  is  a  difficult  problem  for  investigators  attempting  to 
correlate  brain  form  (structure),  derived  from  x-ray  computed  tomography  (CT)  images,  and  brain  function  (metabolism), 
revealed  by  nuclear  medicine  positron  emission  tomography  (PET)  images. 

It is hoped that development of techniques for the accurate correlation of CT structural data with PET metabolic
information  will  enhance  our  understanding  of  the  processes  underlying  the  generation  of  PET  images. 

Our approach has three stages: first, practical methods must be discovered for the accurate and reproducible placement of
the head within a tomographic scanner's aperture. Second, techniques for monitoring head position during the image
acquisition process must be developed to correct for head movement before the image is generated. Third, simplified
algorithms must be found for scaling and registering digitized images from different scanners on a digital display
subsystem. 

Precise orientation of the subject's skull within the scanner's aperture is monitored and recorded through the use of a
PC-based Polhemus position/orientation measurement subsystem, allowing simultaneous use of two independent sensors.
The development of two inexpensive custom-molded oral appliances allows the Polhemus subsystem's sensor to be fixed
to the subject's skull. A novel targeting algorithm was derived to provide visual cues related to head position within a
scanner's imaging volume to the system operator. Two-sensor software was completed, and extensive evaluation has
begun prior to its experimental use with test subjects.

An additional position/orientation measurement subsystem has been obtained and evaluated for linearity and for sensitivity
to nearby metallic objects, a problem common to all electromagnetic-based tracking systems. This device's utilization of
quasi-static fields was designed to increase its immunity to certain types of metal in close proximity. Although
performance of this new position measurement system was good, it did not outperform the Polhemus system in the
presence of PET scanners.

The Distributed Systems Section (DSS) of the Computer Systems Laboratory and the Laboratory Systems Unit (LSU) of
the  Computer  Center  Branch  are  working  jointly  on  a  project  to  provide  support  for  researchers  with  high-performance 
UNIX  workstations  manufactured  by  a  variety  of  vendors. 

The workstations are interconnected by the NIH campus-wide LAN, by which they share resources and access services such
as file backup and archiving, software maintenance, applications software, online documentation, nationwide electronic
mail and news, computation and database servers, laser printers, and a national distributed file system. Applications for
Advanced Laboratory Workstations (ALWs) include molecular graphics and modeling, medical image processing, searching,
statistical analysis, laboratory data acquisition, and desktop publishing.

Use of ALW systems grew to about 150 workstations and over 300 registered users in the following ICDs: CC, DCRT,
NCI, NCRR, NIA, NIAID, NIAMS, NICHD, NIDCD, NIDDK, NIMH, NINDS, and NHLBI. Nine file servers support
over 80GB of disk space.

We  have  begun  a  collaboration  with  the  NCRR  and  the  DRRP  on  the  Multimodality  Radiological  Image  Processing 
System  (MRIPS).  MRIPS  will  enable  NIH  investigators  to  register  and  visualize  2D  and  3D  medical  images  acquired  by 
various means. The ALW Project supports the UNIX workstations and AFS fileservers for this project.

We  continued  collaborating  with  the  NIA  on  an  image  processing  project  which  involves  using  ALWs  to  perform  two 
tasks: (1) registration of PET and MRI three-dimensional images for the analysis of functional anatomy, and (2) rescaling of
MRI  images  to  the  dimensions  of  a  standard  brain  atlas. 

Other  major  users  of  ALW  technology  are  the  Biological  Computation  Facility  (BCF)  of  RBMB,  NINDS,  the  Laboratory 
of  Chemical  Physics  and  Laboratory  of  Molecular  Biology,  NIDDK,  the  CSL  Computational  Science  and  Engineering 
Section,  and  the  Computer  Networks  Branch,  DCRT. 

These studies investigate mathematical models of the immune network and the
kinetics of its many complex interacting components, viz. precursors, CD4+ helper
T-lymphocytes,  CD8+  cytotoxic  T-lymphocytes,  natural  killer  (NK)  and  lymphokine 
activated  killer  (LAK)  monocytes,  interleukins,  and  interferons  by  means  of  a 
system  of  nonlinearly  coupled,  ordinary  differential  equations.   An  appropriately 
constructed  and  validated  network  model  should  suggest  experiments  and  theoretic 
rationale  that  can  guide  the  use  of  therapeutic  interventions  and  vaccines  and 
promote  understanding  of  how  the  immune  system  might  be  manipulated  to  increase  its 
effectiveness  in  preventing  or  combating  pathogenic  infections. 

Current  studies  involve  the  interaction  of  the  human  immunodeficiency  virus 
with  CD4+  helper  T-lymphocytes  (T4  cells)  in  culture  media.   These  simulation 
studies support the Zagury cytopathology model which postulates that each antigenic
stimulation  amplifies  the  presence  of  the  infection  by  the  conversion  (activation) 
of  large  numbers  of  newly  infected  T4  cells;  these  T4  cells  then  express  new  virus, 
lyse, and die, but leave behind free virus that expands the infection of the
precursor  population.   Thus,  the  T4  cell  population  is  left  in  an  increased  state 
of  infection  after  each  occurrence  of  antigenic  stimulation.   The  loss  of  infected 
T4  cells  and  precursors  by  viral  destruction  reduces  the  capacity  of  the  immune 
system  to  respond  to  new  antigens  or  antigens  previously  encountered. 

Current  simulation  studies  attempt  to  deal  with  in  vitro  systems  of  cells  and 
virus in culture media where data could feasibly be obtained by experiment. Hence
these  studies  differ  from  other  published  simulation  studies  that  have  attempted  to 
deal  with  the  entire  in  vivo  immune  system  and  must  make  unrealistic  assumptions 
about  its  character  in  order  to  reduce  the  considerable  number  of  undetermined 
interaction  coefficients. 

This highly interdisciplinary project is dedicated to the theoretical study and practical feasibility of
biomedical applications of recently developed statistical decision procedures that operate on data known
to  be  dominated  by  quantum-mechanical  noise. 

The  methodology  has  been  successfully  used  over  the  last  decade  by  electrical  communications 
engineers,  particularly  those  involved  with  quantum  optics  systems,  and  allows  the  statistician,  for  the 
first  time,  the  opportunity  to  undertake  nearly-classical  statistical  decision  theory  on  data  that,  for 
example, are known to have, in principle, no joint distribution.

This  latter,  highly  non-intuitive  fact,  and  others  that  are  equally  well  experimentally  established,  all 
have  fundamental  consequences  for  how  statisticians  must  re-think  the  planning  of  experiments  and  data 
analyses for processes occurring at the molecular level. This is especially important, for example, when
the  crucial  experiments  require  low  power  levels  so  as  to  not  distort  the  true  underlying  biological 
processes  (in  which  case  not  much  data  can  be  collected)  or  when  important,  but  rare,  events  or  markers 
must  be  sorted  out  from  other  sources  of  noise,  including  quantum  noise. 

Of  special  biomedical  interest  are  novel  pairings  of  this  quantum-consistent  statistical  theory  and 
technologies in the rapidly developing field of light-based imaging devices and detection methods, and
analysis  techniques  using  other  forms  of  radiation.  Possible  biomedical  applications  thus  could  include: 
reduced-dose  PET  scans;  real-time,  laser-based,  reduced-illumination  confocal  microscopy  of 
living  cells  and  tissue.  Also  of  interest  are  applications  that  involve  bioluminescent  molecular 
tagging  and  enhanced  chemiluminescence,  which  would  allow  the  non-invasive, 
non-destructive  study  of  biological  processes  at  the  level  of  individual  molecules  and  atoms. 
Moreover, any of the relatively new fields of molecular electronics and electronics using super-lattices
and quantum wells, and the manipulation of isolated trapped electrons, ions, atoms, molecules or
biological  organisms,  may  provide  the  initial  experimental  contexts  for  optimal  statistical  estimation  and 
decision  making  in  the  presence  of  quantum  noise. 

A research monograph presenting our results is under review by Springer-Verlag Publishers, Inc.,
for  their  Lecture  Notes  in  Statistics  Series,  while  a  shortened  version  of  the  paper  has  been  accepted  for 
publication  by  the  journal  Statistical  Science  as  a  Special  Invited  Paper. 

The large dataset sizes of medical images, an important component of the medical record generated during a patient's
hospital  stay  or  clinic  visit,  unfortunately  represent  a  difficult-to-manage  data  source.    In  attempting  to  consolidate  medical 
images  with  conventional  textual  medical  record  data  in  the  Medical  Information  System  (MIS),  the  NIH  Clinical  Center 
(CC)  is  pursuing  the  goal  of  creating  a  comprehensive  electronic  medical  record.  Toward  this  end,  DCRT  and  the  CC  are 
collaborating in a series of demonstration projects exploring image integration into electronic medical records. Images of
interest  range  in  size  from  16  Kbytes  (diagnostic  electrocardiograms)  to  256  Kbytes  (tomographic  scans)  to  4  Mbytes 
(conventional  film  Xrays). 

Standard 12-lead electrocardiograms (ECG) are automatically acquired, interpreted, and stored in digital form on an ECG
Data Management System in the CC. A remote ECG workstation is being developed as a serial RS-232 gateway to
transfer ECG waveforms and their related diagnoses from this system to the MIS. Since ECG waveforms are essentially
binary images in which the black pixel content is only ca. 0.1%, ECG waveform data are more efficiently stored and
transmitted as time-ordered lists of 10-bit ECG amplitudes rather than as 2.75K x 3K pixel images.
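
The storage argument can be checked with rough arithmetic. The sketch below assumes that "K" means 1024 and that a binary image is stored uncompressed at one bit per pixel; neither assumption is stated in this report, and the function names are hypothetical. It compares such an image with the 16 Kbyte ECG record size quoted above.

```python
K = 1024  # assumption: "K" denotes 1024


def bitmap_bytes(width_px, height_px, bits_per_pixel=1):
    """Bytes needed for an uncompressed image at the given bit depth."""
    return width_px * height_px * bits_per_pixel // 8


def samples_that_fit(n_bytes, bits_per_sample=10):
    """Number of whole amplitude samples that fit in n_bytes."""
    return (n_bytes * 8) // bits_per_sample


image_bytes = bitmap_bytes(int(2.75 * K), 3 * K)  # 2816 x 3072 binary image
ecg_bytes = 16 * K                                # ECG record size from the text
ratio = image_bytes / ecg_bytes                   # image is this many times larger
n_samples = samples_that_fit(ecg_bytes)           # 10-bit amplitudes per record
```

Under these assumptions the bitmap occupies roughly 1 Mbyte, about 66 times the size of the amplitude list.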

Chest  Xrays  routinely  obtained  within  the  Diagnostic  Radiology  Department  are  appropriate  for  integration  into  the  MIS 
as well as for transmission to the relevant outpatient clinic. A recently acquired Vision Ten RITA ! system, containing a
gray-scale  sheet  film  digitizer,  and  two  R  ITA  .'-compatible  image  display  systems,  are  integral  parts  of  the  image  gateway. 
Communication  of  medical  images  between  the  Radiology  Department  Film  Library  and  remote  sites  is  now  possible  over 
the  CC  fiber  optic  network.  The  NHLBI  Cardiac  Surgical  Clinic  was  the  first  outpatient  clinic  to  routinely  use  chest  films 
transmitted  over  this  Ethernet  pathway. 

Future plans include the connection of two CT scanners to this system via ACR-NEMA communication links to dedicated
image  servers  added  to  the  teleradiology  network. 

The goals of the high performance biomedical computing program are to identify and solve those computational problems
in biomedicine that can benefit from high performance hardware, modern software engineering principles, and efficient
algorithms.  This  effort  includes  providing  high  performance  parallel  computer  systems  for  the  NIH  staff  and  developing 
parallel  algorithms  for  biomedical  applications. 

CSL  is  developing  algorithms  for  a  number  of  biomedical  applications  that  can  benefit  from  computational  speedup 
including  image  processing  of  electron  micrographs,  protein  and  nucleic  acid  sequence  analysis,  nuclear  magnetic  resonance 
spectroscopy,  x-ray  crystallography,  protein  folding  prediction,  quantum  chemical  methods,  molecular  dynamics 
simulations,  human  genetic  linkage  analysis,  medical  imaging,  and  radiation  treatment  planning.  Development  teams  for 
each application area include computer engineers and scientists from CSL who design and implement the required parallel
algorithms and methods, and biomedical scientists who provide the necessary application knowledge and become users of
the  developed  software.  The  ultimate  goal  is  to  have  high  performance  parallel  computing  facilitate  the  science  that  is  done 
at  NIH.  While  developing  these  computationally  demanding  applications,  CSL  is  investigating  the  following  high 
performance  computing  issues:  partitioning  a  problem  into  many  parts  that  can  be  independently  executed  on  different 
processors, designing algorithms so that delays of interprocessor communication can be kept to a small fraction of the
computation time, designing the parts so that the computing load can be distributed evenly over the available processors or
dynamically balanced, designing algorithms so that the number of processors is a parameter and the algorithms can be
configured  dynamically  for  the  available  machine,  developing  tools  and  environments  for  producing  portable  parallel 
programs  and  monitoring  system  performance,  and  proving  that  a  parallel  algorithm  on  a  given  machine  meets  its 
specifications. 
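
Two of the issues listed above, even load distribution and treating the number of processors as a parameter, can be illustrated with a static block partition. This is a minimal sketch with hypothetical function names, not CSL's software; the per-block work is run serially here to keep the example self-contained.

```python
def block_partition(n_items, n_workers):
    """Split range(n_items) into n_workers contiguous blocks whose sizes differ by at most one."""
    base, extra = divmod(n_items, n_workers)
    bounds, start = [], 0
    for worker in range(n_workers):
        # The first `extra` workers take one additional item.
        size = base + (1 if worker < extra else 0)
        bounds.append((start, start + size))
        start += size
    return bounds


def partitioned_sum(data, n_workers):
    """Sum data block by block; each block could be handed to a separate processor."""
    partial = [sum(data[lo:hi]) for lo, hi in block_partition(len(data), n_workers)]
    return sum(partial)


blocks = block_partition(10, 3)               # [(0, 4), (4, 7), (7, 10)]
total = partitioned_sum(list(range(10)), 3)   # same result for any worker count
```

Because the worker count is an argument rather than a constant, the same routine can be configured for whatever machine is available.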

Over the past 20 years the Molecular Disease Branch (MDB) of the NHLBI has studied human lipid metabolism disorders
by analyzing tens of thousands of blood samples from nearly 7,000 individuals. Until quite recently, all of the
accumulated data were gathered and entered entirely by hand into central NIH Computer Utility databases.

As  reported  in  FY91,  the  Lipid  Analysis  Sample  Tracking  System  (LASTS)  is  a  comprehensive  PC-based  system  for 
recording the results of lipid analyses performed on plasma samples. Identifying information about the samples is entered
into databases maintained on a laboratory PC and verified when the sample is acquired. The samples, identified by
bar-coded labels, are subdivided for analysis. The system maintains records of the number of samples awaiting each type of
analysis,  scheduling  appropriate  test  runs  when  a  sufficient  number  of  samples  have  accumulated. 
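
The scheduling rule described above can be sketched as a per-analysis queue that releases a test run once enough samples are waiting. The class, method names, analysis names, and batch sizes below are hypothetical and serve only to illustrate the rule; they are not taken from LASTS.

```python
from collections import defaultdict


class RunScheduler:
    """Tracks samples awaiting each analysis and releases a run when a batch is full."""

    def __init__(self, batch_sizes):
        self.batch_sizes = batch_sizes      # analysis name -> samples needed per run
        self.pending = defaultdict(list)    # analysis name -> waiting bar codes

    def add(self, barcode, analysis):
        """Queue one sample; return the list of bar codes for a run if one is due."""
        queue = self.pending[analysis]
        queue.append(barcode)
        if len(queue) >= self.batch_sizes[analysis]:
            run = list(queue)
            queue.clear()
            return run
        return None

    def waiting(self):
        """Number of samples awaiting each type of analysis."""
        return {name: len(queue) for name, queue in self.pending.items()}


scheduler = RunScheduler({"cholesterol": 3, "triglyceride": 2})
first = scheduler.add("S001", "cholesterol")    # None: batch not yet full
second = scheduler.add("S002", "cholesterol")   # None
third = scheduler.add("S003", "cholesterol")    # run released with all three
```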

As each analysis is performed, the results are either captured directly from the colorimetric analyzer or keyed by the
bar-coded label for manual entry. The results to date on each sample are maintained in a database that can be searched by
laboratory personnel or the referring physician. Once the validity of the test results has been certified, the sample data are
copied to a report dataset that is then transferred to the NIH Central Computer Utility and incorporated into the MDB lipid
study  databases. 

Verified data are also maintained locally in a form suitable for access by PC-based database query programs. Statistics
about controls and standards are also maintained.

The  LASTS  system  is  now  in  full  production  use.  The  time  required  to  log  and  schedule  samples  has  been  drastically 
decreased  and,  more  importantly,  the  investigators  have  found  that  the  improved  access  to  their  results  has  enabled  them  to 
think about their experiments in completely new ways.


PHS 6040 (Rev. 5/92)


DEPARTMENT OF HEALTH AND HUMAN SERVICES • PUBLIC HEALTH SERVICE


NOTICE OF INTRAMURAL RESEARCH PROJECT


PROJECT NUMBER

Z01 CT00203-03 CSL


PERIOD COVERED

October 1, 1991 to September 30, 1992


TITLE OF PROJECT (80 characters or less. Title must fit on one line between the borders.)

Diode Array Spectrophotometer


PRINCIPAL INVESTIGATOR (List other professional personnel below the Principal Investigator.) (Name, title, laboratory, and institute affiliation)

PI:        H. A. Fredrickson    Computer Systems Analyst    CSL, DCRT

Others:    A. R. Schultz    Chief, Laboratory and Clinical Sys. Sec.    CSL, DCRT

           W. Friauf    Supv. Electronics Engineer    BEIP, NCRR

           J. Cole    Electronics Engineer    BEIP, NCRR

           P. Smith    Physicist    BEIP, NCRR

           R. Hendler    Chief, Lab of Cell Biology    IR, L, NHLBI


COOPERATING UNITS (if any)


LAB/BRANCH     Computer Systems Laboratory


SECTION        Laboratory and Clinical Systems Section


INSTITUTE AND LOCATION    DCRT, NIH, Bethesda, MD 20892


TOTAL STAFF YEARS:


PROFESSIONAL: 5    OTHER:

CHECK APPROPRIATE BOX(ES)

□ (a) Human subjects    □ (b) Human tissues    ☒ (c) Neither
   □ (a1) Minors
   □ (a2) Interviews

A computer-controlled 100-channel High Speed Diode Array Spectrophotometer has been developed by the Biomedical
Engineering and Instrumentation Program of the National Center for Research Resources and CSL for the Laboratory of
Cell Biology (LCB), NHLBI. It will be used to obtain more complete spectral information about the rapid changes of the
reduction and oxidation centers within the protein enzyme cytochrome oxidase. This enzyme is involved in cellular
respiration and is located within the inner lipid bilayer of the mitochondrion.

The  electronic  hardware  consists  of  two  48-element  photodiode  arrays,  with  each  element  connected  to  a  discrete  A/D 
converter and, subsequently, to local storage channels. Each channel is capable of acquiring data at 10 microsecond
intervals.  A  fast  personal  computer  (PC)  is  used  to  control  the  spectrophotometer.  Both  timing  control  signals 
transmitted  from  the  PC  and  the  received  data  from  the  A/D  channels  are  conveyed  through  a  40-bit  parallel  interface. 

The  Computer  Systems  Laboratory  assisted  in  developing  the  PC  interface  to  the  spectrophotometer  and  in  creating  the 
data  acquisition  and  control  software.  It  is  anticipated  that  the  commercial  data  manipulation  language  MLAB  will  be  used 
for  analysis  of  the  data. 

The spectrophotometer has been built and delivered to NHLBI. Laboratory testing of both the hardware and the software
systems  has  been  completed. 




:  PRO-ECr  NUMBER 
DEPARTMENT  OF  HEALTH  AND  HUMAN  SERVICES  •  P'JBLiC  HEALTH  SERVCE 


NOTICE    OF    INTRAMURAL    RESEARCH    PROJECT 


ZOl  CT00204-03  CSL 


PERIOD  COVERED 

October  1,  1991  to  September  30,  1992 


TITLE  Of  PROJECT      (80  characters  or  less.    Title  must  lit  on  one  line  between  the  borders.) 

Computer  Assisted  Patient  Interviewing  in  Clinical  Pharmacy 


PRINCIPAL  INVESTIGATOR  (List  other  protessional  personnel  below  the  Principal  Investigator  I  (Name,  title   laboratory,  and  institute  atliliationj 

PI:         J.  M.  DeLeo  Computer  Systems  Analyst  CSL,  DCRT 

Others:    p.  Pucmo  Pharmacist  PHAR,  CC 

K.  Calis  Pharmacist  PHAR,  CC 


COOPERATING  UNITS  (it  any) 


LAB/BRANCH     Computer  Systems  Laboratory 


SECTION  Laboratory  and  Clinical  Systems  Section 


INSTITUTE AND LOCATION    DCRT, NIH, Bethesda, MD 20892


TOTAL  STAFF  YEARS 


PROFESSIONAL      7  -    OTHER: 


CHECK  APPROPRIATE  BOX(ES) 

LJ    (a)  Human  subjects  D    (b)  Human  tissues  EH       (c)  Neither 

□   (a1)  Minors 

'U  (a2)  Interviews 
As clinical pharmacists become more involved in direct patient care by dispensing medication information and helping to
identify potential drug interaction-induced health hazards, they must keep abreast of new drug information, allocate more
one-on-one time for patients, and maintain effective interviewing skills. In support of these activities, CSL began a
collaboration in 1990 with the NIH Clinical Center Pharmacy Department.

The  objective  was  to  develop  a  computer  interviewing  system  that  collects  medication  histories,  dispenses  medication 
information  to  patients,  and  detects  possible  untoward  events  related  to  medication  regimens,  thereby  making  more 
pharmacist time available for patients who are not candidates for computer interviewing. Warnings generated by this
system  could  aid  in  focusing  the  pharmacist-patient  interaction. 

The initial version of the medication history system, developed, tested, and reported in FY91, included system interview
scripts  for  patient  demographics,  health  history,  and  drug  usage.  A  concise  summary  report  is  produced  after  the  interview. 

In FY92 the interview scripts and adverse drug reaction (ADR) thesaurus were refined and a comprehensive report describing
ADRs sorted by body systems was prepared and submitted to USP to assist in standardizing future ADR terminology.
Database modules were designed to support adverse drug reaction and drug-drug interaction detection and programs were
created for patient drug information and physician drug monitoring information retrieval. Also, a neural network module
for adverse drug reaction detection was developed that demonstrated improvement over a previous multi-level system
evaluation scheme.

Completion and integration of all program modules and database components, and initiation of formal testing and
evaluation of the completed system are planned for FY93.
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SUMMARY  OF  WORK     (Use  standard  unreduced  type.   Do  not  exceed  the  space  provided.) 

Computer Systems Laboratory (CSL) has integrated laboratory acquisition and data processing computer systems in the
Receptor Biochemistry and Molecular Biology Branch (RBMB), NINDS. Seven automated DNA sequencers produce 168
sequence  fragments  or  about  60,000  bases  of  sequence  per  day.  We  have  specified,  installed  and  administered  the  computer 
facilities  used  for  analyzing  and  archiving  this  large  volume  of  DNA  sequence  data. 

In  FY92  CSL  provided  system  management,  programming  and  database  support  for  the  cDNA  Project,  funded  in  part  by 
the  Department  of  Energy  Center  for  Disease  Control,  that  seeks  to  identify  and  characterize  the  nearly  30,000  genes 
expressed in the human brain, producing expressed sequence tags (ESTs) from human brain cDNAs at the rate of ~100
templates/day. Over 6,000 EST sequences have been published and submitted to GenBank. Similarly, the goal of the
smallpox project, a collaboration with the Department of Energy Centers for Disease Control (DECDC), is to sequence
completely the 170 kilo-base smallpox genome prior to its destruction in December, 1993. Genome sequencing and
assembly  is  nearly  complete  and  analysis  of  the  finished  sequence  is  underway. 

System  management  assistance  included  daily  operations  support  for  this  high-volume  production  system,  specification  and 
initiation  of  system  backup  procedures,  and  installation  and  testing  of  commercial  and  public  domain  software.  A  set  of 
programs was created to batch process ESTs for database searching. A protein library motif comparison program was
ported  to  the  CSL  parallel  computer. 

CSL  is  purchasing  a  special  purpose  sequence  assembly  and  analysis  system  based  on  a  high  speed  text  search 
computational  engine  to  be  made  available  to  the  NIH  intramural  community  as  a  shared  resource  connected  to  the  NIH 
network. 

During  FY92,  emphasis  has  continued  to  shift  from  sequence  assembly  to  sequence  analysis  and  data  archival/retrieval. 
CSL will continue to concentrate on methods for processing and archiving extremely large volumes of sequence data, and
help  to  make  technologies  developed  through  this  project  available  to  other  investigators  at  NIH. 
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SUMMARY  OF  WORK     (Use  standard  unreduced  type     Do  not  exceed  the  space  provided.) 

The Laboratory Analysis Package (LAP) program was originally developed to run on SUN3 UNIX workstations as a
general-purpose tool for both interactive and batch processing of laboratory data. LAP has been ported to SUN4,
VAX/VMS, and Convex architectures. It is used extensively by two laboratories in NIDDK and at several flow cytometry
sites  at  NIH. 

LAP can perform a wide range of data manipulation on vector data, x-y paired data, and matrix data using either a command
or  expression  syntax.    Customized  command  procedures  can  be  saved  in  files  and  added  to  the  LAP  command  set.  Results 
may  be  viewed  as  line  graphs,  scatter,  or  bar  graphs,  perspective  views,  or  contours  in  color  or  monochrome  viewports 
within  an  X  window  or  on  Tektronix  4010  and  4107  compatible  terminals.  Publication  quality  plots  may  be  produced  in 
several  formats,  including  encapsulated  PostScript  and  HPGL. 

During FY92, LAP was upgraded from C++ version 1 to version 2 under VAX/VMS and ported to the NIH Convex
computer system. A Reference Guide documenting LAP commands and two User's Guides, one for general use and one
tailored to flow cytometry users, have been provided.
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SUMMARY  OF  WORK  (Use  standard  unreduced  type.  Do  not  exceed  the  space  provided.) 

The  project  objective  is  to  develop  campus-wide  access  via  wide  area  network  (WAN) 
to  library  information  systems.   The  knowledge  and  experience  will  provide  the 
basis  of  advice  given  to  other  NIH  components.   The  project  to  provide  users  with 
network  access  to  DCRT  Library  information  systems  began  in  1988,  with  remote 
access  to  the  online  catalog  and  various  library  information  files.   In  April, 
1989,  NIH  Library  staff  visited  the  DCRT  Library  to  review  LAN  installations. 
Investigations  and  testing  of  networking  solutions  for  CD  ROM  (compact  disk  read 
only  memory)  information  systems  led  to  installation  of  OPTI-NET,  in  April,  1989. 
An  internetworking  solution  permitted  users  on  the  DCRT  LAN  in  four  buildings  to 
access  systems  on  the  Library  LAN.   In  1990,  network  licensing  was  arranged  for: 
Computer  Select,  Microsoft  Programmer's  Library,  and  Lotus  Prompt.   Library  staff 
addressed  the  March  3Com  CURE  (network  users  group)  to  present  experiences  and  to 
demonstrate  Computer  Select.  In  February,  1991,  the  Library  catalog,  various 
information  files,  and  CD  ROM  publications  were  migrated  to  PUBnet,  the  campus-wide 
public  3Com  network  on  NIHnet.   OPTI-NET  was  later  replaced  by  a  C.B.I.S.  optical 
server  and  system.   Users  on  over  250  3Com  networks  throughout  NIH  can  access 
PUBnet  information  from  their  offices  and  their  own  workstations.   Four  CD  ROM 
publications  are  available  on  PUBnet. 

During  this  past  year,  we  began  to  explore  campus-wide  information  dissemination 
via  the  local  Gopher  system  on  the  DCRT  Convex.   A  demonstration  project  began, 
using  the  1991  Current  Index  to  Statistics  database.   In  consultation  with  NIH 
statisticians,  a  new  retrieval  system  will  be  designed. 

In  addition  to  answering  queries  and  providing  brief  consultations  on  networking  CD 
ROMs and the selection and purchase of hardware and software, Library and PCB staff
met  with  NIH  Library  staff  to  review  experiences  and  to  answer  questions  regarding 
their  future  plans  to  network  these  electronic  publications.   Testing  of  various 
electronic  publications  will  continue  on  the  DCRT  LAN,  PUBnet,  and  Gopher. 
In FY91 we reported preliminary experiences in developing a back-error propagation Artificial Neural Network (ANN) to
predict survival with a small (170 cases) breast cancer database using as explanatory covariates tumor size, number of
positive lymph nodes, histologic grade, and estrogen and progesterone receptor status. We concluded that individual patient
survival curves could be computed and that bootstrapping methods could be used to compute survival confidence intervals
with  larger  databases. 

Progress  in  FY92  included  a  comparison  of  the  Cox  regression  model  with  an  ANN  approach  to  survival  analysis,  leading 
to  the  conclusion  that  an  ANN  approach  would  not  be  constrained  by  the  proportional  hazards  assumption  of  the  Cox 
model,  thus  suggesting  factors  for  which  predictive  associations  are  not  yet  known. 

A  collaboration  was  begun  with  Dr.  Donald  Henson  (NCI),  present  chairman  of  the  American  Joint  Committee  on  Cancer 
(AJCC). Dr. Henson and the AJCC are very interested in the application of appropriate computing methodologies to
prognostic  factors  for  evaluation  and  use  in  patient  outcome  prediction  and  management. 

A  program  was  written  for  computing  group  actuarial  survival  functions  based  on  the  assumption  of  equal  interval  hazard 
rates  for  censored  and  non-censored  events. 
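
The report does not describe the program itself. As an illustration only, the sketch below uses the standard life-table (actuarial) estimate, in which subjects censored during an interval are counted as at risk for half of that interval. This is one common reading of the equal-hazard assumption and is not necessarily the method the authors implemented; the function name and the input layout are hypothetical.

```python
def actuarial_survival(n_start, intervals):
    """Life-table estimate of cumulative survival.

    n_start   -- number of subjects alive and under observation at time zero
    intervals -- list of (deaths, censored) counts, one pair per equal-width interval
    Returns the cumulative survival probability at the end of each interval.
    """
    at_risk = n_start
    survival = 1.0
    curve = []
    for deaths, censored in intervals:
        # Assumption: censored subjects contribute half an interval of exposure.
        effective = at_risk - censored / 2.0
        hazard = deaths / effective if effective > 0 else 0.0
        survival *= 1.0 - hazard
        curve.append(survival)
        at_risk -= deaths + censored
    return curve


if __name__ == "__main__":
    # Hypothetical cohort: 100 patients followed over three yearly intervals.
    for year, s in enumerate(actuarial_survival(100, [(10, 5), (8, 7), (6, 4)]), 1):
        print(f"year {year}: S = {s:.3f}")
```

Applying the same function to each prognostic subgroup gives the group curves against which individual ANN predictions can be compared.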

Studies  were  conducted  with  data  extracted  from  the  NCI  Surveillance,  Epidemiology,  End  Results  (SEER)  program.  A 
6,000-case  melanoma  database  was  used  to  demonstrate  basic  concepts  in  ANN  survival  prediction.  The  back-error 
propagation ANN developed last year was refined and applied to a 44,000-case breast cancer database to demonstrate ANN
survival  prediction  based  on  multi-explanatory  factors. 

Presentations  of  methods  and  results  were  made  at  the  AJCC  annual  Meeting  in  January  1992  in  San  Diego  and  before  the 
AJCC  Task  Force  on  Multiple  Prognostic  Factors  in  Chicago  in  June  1992. 
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SUMMARY  OF  WORK     (Use  standard  unreduced  type.    Do  not  exceed  the  space  provided ) 

The  NIH  research  community  has  a  wide  range  of  image  processing  needs  for  which  individual  purchase  of  equipment  and 
development of software are neither cost nor time effective. Despite these constraints, their research may still benefit from
image  processing.  CSL  is  attempting  to  fill  this  gap  by  offering  access  to  high  quality  image  acquisition  hardware  and 
software,  and  offering  meaningful  support  in  applying  these  image  processing  tools.  To  ensure  proper  results,  the 
problems  of  quantization  levels,  resolution,  spectral  filtering  and  the  complexities  of  software  must  be  understood,  and 
correct  technological  solutions  applied. 

After  examining  various  imaging  hardware  and  systems,  CSL  assembled  a  high-resolution,  wide  dynamic  range  CCD 
camera system from the best available components. The system also includes a light box, Macintosh IIfx computer,
wavelength selective filters and density step tablets. The camera is a Photometrics CCD with a thermoelectrically cooled
Kodak CCD image sensor. The camera can form images with a maximum 1320 x 1035 spatial resolution and a pixel
resolution of 12 bits. Software includes the NIH IMAGE program, IPLAB, and Digital Image Processing Station. The
entire  system  is  now  accessible  to  the  NIH  scientific  community  in  the  DCRT  Scientific  Computing  Resource  Center 
users  area. 

Image processing support has been extended to include an NIH training seminar and a technical supplement document for
the public domain NIH IMAGE program. The seminar and manual are intended for those interested in the source code and
macro developer feature of the IMAGE program. The technical supplement, now distributed with the IMAGE Pascal
source code, describes how images can be accessed and modified. It also includes information about how textual
information  relating  to  user-defined  processing  can  be  saved. 
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Laboratory  of  Applied  Studies 
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Applied  Mathematics  Section 


INSTITUTE  AND  LOCATION 

DCRT,  NIH,  Bethesda,  MD   20892 


TOTAL  STAFF  YEARS: 
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OTHER: 
CHECK APPROPRIATE BOX(ES)

D   (a)  Human  subjects     D   (b)  Human  tissues 
D   (a1)  Minors 
D   (a2)  Interviews 


(c)  Neither 


SUMMARY OF WORK (Use standard unreduced type. Do not exceed the space provided.)

The  objective  of  this  project  is  to  acquire,  make  available,  and  support 
mathematical  modeling  and  data  analysis  software  systems  that  are  accessible  to 
investigators  from  many  disciplines.   The  principals  involved  develop,  test  and 
implement  such  software  on  DCRT-supported  computing  platforms.   They  develop 
supplementary  software  and  utilities  to  optimize  the  use  and  efficiency  of  such 
software  systems.   In  addition,  consultation  and  training  are  made  available  both 
through  formal  DCRT-sponsored  courses  and  through  individual  consultations. 

In FY92, all DCRT versions of PC MLAB were upgraded to the current versions and a
course  on  its  use  was  taught.   Consultations  were  made  to  several  IRP  investigators 
and  software  exchange  and  upgrades  were  carried  out  in  cooperation  with  the 
Computer  Systems  Laboratory.   The  tutorial  manual  was  updated  to  reflect  the  new 
MLAB  features  and  copies  were  distributed  to  class  attendees. 

New  smaller  versions  of  PC  MLAB  are  available  at  reduced  cost  ($1,000  per  copy)  and 
the  software  retains  most  of  the  important  features  of  the  fully  developed  version. 
These  features  include  least  squares  model  fitting  of  functional  models  and  models 
represented  by  systems  of  ordinary  differential  equations.   This  version  should  be 
satisfactory  for  the  majority  of  laboratory  applications.   When  a  model  exceeds  the 
capacity  of  the  reduced  version,  full  capability  versions  are  available  for  NIH 
investigator  use  in  the  DCRT  Scientific  Computing  Resource  Center  (SCRC)  or  in  an 
auxiliary  facility.   Investigators  should  contact  the  SCRC  for  more  information. 
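
As a rough illustration of what least squares fitting of a model defined by a differential equation involves, the following sketch fits the rate constant k of the single equation dy/dt = -k*y to sampled data. It is not MLAB code and does not reflect MLAB's algorithms: the Runge-Kutta integrator, the golden-section search, the function names, and the sample data are all assumptions chosen to keep the example self-contained.

```python
import math


def simulate(k, y0, times, h=0.01):
    """Integrate dy/dt = -k*y from t = 0 with classical RK4; return y at each time."""
    def f(y):
        return -k * y

    out = []
    t, y = 0.0, y0
    for target in times:  # times must be increasing
        while t < target - 1e-12:
            step = min(h, target - t)
            k1 = f(y)
            k2 = f(y + 0.5 * step * k1)
            k3 = f(y + 0.5 * step * k2)
            k4 = f(y + step * k3)
            y += step * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
            t += step
        out.append(y)
    return out


def fit_rate(y0, times, data, lo=0.0, hi=5.0, tol=1e-6):
    """Golden-section search for the k that minimizes the sum of squared residuals.

    Assumes the sum of squares has a single minimum on [lo, hi].
    """
    def sse(k):
        return sum((m - d) ** 2 for m, d in zip(simulate(k, y0, times), data))

    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if sse(c) < sse(d):
            b = d
        else:
            a = c
    return (a + b) / 2.0


if __name__ == "__main__":
    times = [0.5, 1.0, 1.5, 2.0]
    data = [10.0 * math.exp(-0.7 * t) for t in times]  # synthetic, noise-free
    print("estimated k:", round(fit_rate(10.0, times, data), 4))
```

A system of several equations with several unknown parameters needs a multidimensional optimizer in place of the one-dimensional search, which is the kind of capacity limit that separates reduced and full versions of such packages.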
Statistical and Computational Methods for Molecular Biology, DNA Sequence and Protein Structure
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Dept.  Innere  Medizin,  Universitatsspital,  Zurich,  Switzerland  (M.  Berger). 
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SUMMARY  OF  WORK     (Use  standard  unreduced  type.  Do  not  exceed  the  space  provided.) 

A  new  statistical  model  for  prediction  of  protein  secondary  structure  was  able  to  achieve  76% 
accuracy  on  three  structural  classes.  This  model  utilized  penalized  maximum  likelihood  techniques  for  a 
quadratic  logistic  model  based  on  17  residue  neighborhoods.  The  parameters  of  the  model  were 
interpreted  in  light  of  known  preference  patterns  for  residue-residue  contacts.  A  nonparametric  kernel 
density  estimation  approach  produced  greater  than  60%  accuracy,  and  could  effectively  incorporate  both 
homology  and  structural  class  information  into  the  predictions.  Data  visualization  techniques  were 
effectively  employed  to  aid  understanding  and  communication  of  features  of  the  Brookhaven  Data  Base 
of  protein  structures.  C-alpha  distance  and  C-alpha  contact  maps,  appropriately  presented,  revealed 
patterns and textures related to regularities in structure. The statistical distribution of alpha-carbon pair
separation  distances  as  a  function  of  chain  separation  revealed  significant  patterns  related  to  secondary 
structures.  A  normal  and  lognormal  distribution  gave  good  approximation  to  the  observed  distribution, 
with  some  significant  departures. 
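
For readers unfamiliar with these representations, the sketch below shows how a C-alpha distance map, a contact map, and distances grouped by chain separation can be computed from a list of alpha-carbon coordinates. The 8 angstrom contact cutoff, the function names, and the toy coordinates are assumptions made for illustration; the report does not state the cutoff or the software used.

```python
import math


def distance_map(coords):
    """Matrix of pairwise C-alpha distances; coords is a list of (x, y, z) tuples."""
    return [[math.dist(a, b) for b in coords] for a in coords]


def contact_map(coords, cutoff=8.0):
    """Boolean matrix: True where two alpha carbons lie within `cutoff` angstroms.

    The 8 angstrom default is a common convention, assumed here.
    """
    return [[d <= cutoff for d in row] for row in distance_map(coords)]


def distances_by_separation(coords):
    """Group pair distances by chain separation |i - j|."""
    groups = {}
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            groups.setdefault(j - i, []).append(math.dist(coords[i], coords[j]))
    return groups


if __name__ == "__main__":
    # Toy chain: four residues on a straight line, 3.8 angstroms apart.
    chain = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
    for row in contact_map(chain):
        print("".join("#" if hit else "." for hit in row))
```

Collecting the values in each separation group across many structures gives the empirical distributions to which normal or lognormal curves can then be fitted.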

Our  previously  developed  algorithm  for  alignment  of  multiple  sequences  was  upgraded  and 
optimized  for  its  implementation  on  DOS-based  machines.  The  algorithm  is  presently  being 
implemented  on  the  Intel  parallel  machine  in  collaboration  with  CSL.  Other  groups  (ICOT-Japan)  have 
adapted  our  algorithm  to  advantage  using  parallel  architectures.  The  algorithm  promises  to  be  highly 
efficient  in  a  parallel  setting. 
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Statistical  and  Computational  Methods  for  Physiology,  Pharmacology  and  Endocrinology 
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J.  Zimmerberg  Chief  LTPB,  NICHD 

T.  Costa  Visiting  Scientist  ABS,  OD,  DCRT 


COOPERATING  UNITS  (if  any) 

University  of  Pittsburgh  Medical  Center  (W.  Winters);  Rush  Medical  College,  Dept.  of  Physiology  (F. 
Cohen);  University  of  Milan,  Italy  (G.  E.  Rovati);  Institute  of  Clinical  Obstetrics  and  Gynecology, 
University  of  Modena,  Italy  (A.  Genazzani). 


LAB/BRANCH  Office of the Director


SECTION  Analytical  Biostatistics  Section 


INSTITUTE AND LOCATION   DCRT, NIH, Bethesda, MD 20892


TOTAL  MAN-YEARS:      0.5 


PROFESSIONAL:  0.5 


OTHER:     0


CHECK  APPROPRIATE  BOX(ES) 

n    (a)  Human  subjects  D    (b)  Human  tissues  S       (c)  Neither 

□  (a1)  Minors 

D  (a2)  Interviews 
The previously developed pulsatility analysis algorithm (PULSEFIT) was applied in several
additional studies. In one, small, frequent peaks of luteinizing hormone (LH) could be discerned
reliably,  and  confirmed  with  an  assay  for  the  alpha  subunit  of  LH.  Pulsatility  analysis  was  also 
conducted  in  an  ongoing  study  of  normal  children,  in  collaboration  with  researchers  at  EPA. 

A  mathematical  model  explaining  apparent  cooperative  binding  of  estrogen  to  receptor  was 
extended to account for the temperature dependence of the cooperative phenomena. The model
postulates  formation  of  a  heterodimer  of  receptor. 

A  novel  explanation  of  drug  efficacy  and  negative  antagonism  in  terms  of  a  ternary-complex  model 
was  produced.  The  model  explains  the  sodium  effect  and  certain  other  observations  for  G-protein 
mediated  cell  surface  receptors  such  as  the  opiate  receptor. 

Statistical consultations with many investigators at NIH were undertaken in support of statistical
problems in ligand-binding, receptor modeling, kinetic modeling, immunoassay, and dose-response
analysis. 
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Logic  programming-based  query  system  chromosomal  information 
SUMMARY OF WORK (Use standard unreduced type. Do not exceed the space provided.)

A  computational  biology  "collaboratory"  of  computer  scientists  and  biologists  that  gathers  every  5 
months  for  3-5  days  at  the  NIH  or  ANL  to  work  on  the  development  of  a  new  set  of  integrated  tools  for 
the manipulation of genomic information has been organized. Development of these informatics tools
and  assembly  and  analysis  of  the  integrated  data  continues  over  the  Internet  between  meetings.  The 
initial  objective  of  this  research  was  to  establish  the  minimal  criteria  necessary  to  describe  genomic  map 
data  that  may  be  logically  manipulated.  The  result  of  this  collaborative  effort  has  been  the  development 
of  several  prototype  deductive  database  systems:  First,  the  E.  coli  chromosome  query  system  that 
contains  information  provided  by  Kenn  Rudd  at  NCBI  for  aligned  DNA  sequences,  a  high  resolution 
physical map, identified structural genes, and an aligned phage map; second, DCRT-integrated
collective  genetic  and  DNA  sequence  data  for  S.  typhimurium;  third,  an  integration  of  the  genome 
information  for  S.  pombe  provided  by  Hans  Lehrach  of  the  Imperial  Cancer  Research  Fund,  London, 
U.K., including a genetic linkage map, and Yeast Artificial Chromosome (YAC) and cosmid-hybridization
data. The aligned chromosome information for each of these prototypes may be viewed
using  a  common  graphical  display  program  developed  at  the  Argonne  National  Laboratory  (ANL)  for 
the  "collaboratory". 

A  new  technology,  the  integrated  Genome  Database  developed  by  our  ANL  colleagues,  allows  the 
integration  of  the  collected  genetic  and  physical  data  of  multiple  organisms.  We  have  developed 
numerous  tools  to  facilitate  the  rapid  integration  of  genomic  data  into  this  system.  The  common  feature 
of  each  of  these  prototype  data  representation  systems  is  that  each  system  uses  the  logic  programming 
language  Prolog.  We  can  rapidly  develop  complex  queries  of  the  integrated  data  that  take  advantage  of 
the  complicated  inter-relationships  inferred.  For  example,  in  the  E.  coli  system,  finding  the  longest 
repeated sequences that are found on the same face of the DNA helix within any gene is a simple Prolog
query.  We  have  taken  advantage  of  this  advanced  query  capacity  to  begin  the  analysis  of  the  global 
organization  of  selected  genomes.  The  analysis  of  the  distribution  of  transcription  factor  binding  sites 
relative  to  known  promoters  and  genes  has  allowed  us  to  begin  defining  local  regulatory  grammars  for 
the  genetic  regulation  of  metabolic  pathways.  We  are  now  using  these  systems  to  correlate  the 
arrangement  of  different  types  of  genetic  information  represented  in  each  chromosome  type. 
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SUMMARY  OF  WORK     (Use  standard  unreduced  type.  Do  not  exceed  the  space  provided.) 

The  analysis  of  genomic  organization  requires  a  facility  that  allows  a  biologist  to  logically  manipulate 
the  complex  interrelated  information  thoughtfully.  A  collaborative  working  group  has  been  established 
to  explore  new  hardware  and  software  technologies  that  may  be  applied  to  the  analysis  of  genomic 
information.  The  goals  for  this  group  are  to  apply  logic  programming  to  the  development  of  prototype 
analysis  and  simulation  systems  for  biological  systems.  A  series  of  collaborative  visits  and  workshops 
have  been  used  to  define  three  areas  of  collaborative  research:  1)  simulation  of  protein  folding  using  rule 
based  methods;  2)  logical  manipulation  of  chromosomal  map  information;  and  3)  identification  of 
molecular  sequence  motifs.  The  results  of  this  collaborative  effort  have  been  presented  at  an 
international meeting and published.
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Understanding  protein-nucleic  acid  interactions 
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SUMMARY  OF  WORK    (Use  standard  unreduced  type.  Do  not  exceed  the  space  provided.) 

What  is  the  structure  and  mechanism  of  interaction  of  the  "zinc  finger"  metal  bridge  domains  with 
nucleic  acids?  This  question  is  addressed  through  a  detailed  structural  analysis  of  the  finger  regions 
from  several  known  nucleic  acid  binding  proteins.  A  database  of  zinc  finger  proteins  has  been 
assembled  for  the  purpose  of  statistical  and  structural  modeling  of  the  individual  finger  regions.  The 
Zinc  Finger  Database  was  established  to  accumulate  a  complete  collection  of  potential  zinc  finger  gene 
sequences  that  could  be  rapidly  searched  by  the  members  of  the  research  community  to  prevent  a  major 
duplication of sequencing efforts. This collection contains both published and unpublished gene
sequence  data.  The  database  is  available  for  sequence  comparisons  on  the  DCRT  Convex  240.  The 
service  provided  to  the  research  community  is  that  new  zinc  finger  gene  sequences  are  e-mailed  to  the 
NIH and added to the database, and a FASTA search of the results is returned without the alignments. A
histogram  of  the  statistical  distribution  of  the  search  results  and  listing  of  the  potential  scores  are 
included.  When  a  match  to  an  unpublished  sequence  occurs,  then  the  name  and  contact  information  of 
the  submitting  author  are  returned  so  the  concerned  parties  may  correspond  with  each  other.  To  date  the 
collection  contains  178  different  entries,  which  is  a  20%  increase  compared  to  last  year.  This  large 
collection  of  functionally  related  sequences  has  provided  an  excellent  problem  set  for  multiple  sequence 
alignment and motif analysis tests.

Statistical analysis of these data has revealed 5 repeat classes of "zinc finger" metal bridge domains
ranging in length from 27 to 32 amino acids and 22 different repeat patterns. Furthermore,
compositional statistics of the largest class of domains, 29 amino acid repeat length, reveal a
remarkable conservation of serine or threonine when there is an arginine or glutamine in the DNA
binding region of the finger domain. A correlation between the nucleic acid sequence
and some domains has been observed. Physical modeling of the potential DNA interaction helical
regions of selected domains with a DNA helix has revealed some details of dehydration of the major
groove for a sequence-specific recognition event. These studies are continuing to explore each possible
DNA  interaction  for  a  selected  set  of  finger  domains. 
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SUMMARY  OF  WORK     (Use  standard  unreduced  type.    Do  not  exceed  the  space  provided ) 

As previously reported, the Flow Cytometry Advanced Data Analysis Project (FC/ADA) is a collaborative laboratory
automation project to design and implement a production-oriented basic research support facility capable of the acquisition,
archiving, and in-depth analysis of multi-parameter flow cytometry data.

The facility permits analytical techniques, such as non-hierarchical cluster analysis and multidimensional gated
histogramming, to be applied to experimental data. A data staging and archiving system scaled to match production data
acquisition rates is also provided for near-online access to experimental data for an extended period of time and automatic
archival storage (and retrieval) of all experimental data.
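
The report does not say which clustering algorithm the facility uses. As a generic example of a non-hierarchical method, the sketch below applies plain k-means to multi-parameter events. The function name, the choice of the first k events as starting centers, and the sample values are all assumptions made for illustration.

```python
def kmeans(points, k, iterations=20):
    """Plain k-means clustering.

    points -- list of equal-length numeric tuples (one tuple per measured event)
    k      -- number of clusters
    Returns (labels, centers). Starting centers are the first k points, an
    assumption that keeps the example deterministic.
    """
    centers = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each event to its nearest center.
        labels = [
            min(range(k), key=lambda c: sum((x - m) ** 2 for x, m in zip(p, centers[c])))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


if __name__ == "__main__":
    # Hypothetical two-parameter events forming two populations.
    events = [(1.0, 1.0), (9.0, 9.0), (1.2, 0.8), (8.8, 9.1), (0.9, 1.1), (9.2, 8.9)]
    labels, centers = kmeans(events, 2)
    print(labels)
    print(centers)
```

Production flow cytometry data involve far more events and parameters, so a practical implementation would also need careful initialization and a convergence test.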

Supporting  more  than  55  Experimental  Immunology  Branch  (EIB)  investigators,  the  software  and  techniques  being 
developed under this project are also shared with other flow cytometry facilities within the NIH intramural research program
and with the FDA Center for Biologics Evaluation and Research.

In FY92 the production workload of the EIB facility was shifted to the FACSStar Plus cytometer and its associated
VAX/VMS data management and analysis system. A 30 Gbyte magneto-optical disk Hierarchical File Storage System
(HFSS) was purchased to augment the existing 8mm tape archiving system with near-online storage. The Cluster
Analysis Program (CAP) has been refined and more thoroughly documented. The VMS hosting of the Laboratory
Application Package (LAP) has been improved and the user documentation and flow cytometry specific features have been
significantly enhanced.

Work  in  FY93  will  center  on  system  tuning  and  load  balancing,  completion  of  system  and  user  documentation  and 
personnel training. Upon arrival of the HFSS (early 1993), the hardware and software will be installed first in CSL for
testing prior to permanent installation at the EIB production facility, end of FY93. The HFSS will be available to the EIB
facility during this period over the NIHnet.

Work on CAP will include significant user interface improvements and the development of additional classification
algorithms and tactics.
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Applied  theoretical  research  on  AIDS  proteins  and  other  molecules  of  biomedical  interest  as  well  as  basic 
research  involving  macromolecules  is  in  progress. 

Molecular  dynamics  simulations  of  AIDS  proteins  involve  projects  directly  related  to  the  NIH  Intramural  AIDS 
Targeted  Antiviral  Program.  The  general  goal  is  to  understand  binding  interactions  with  HIV-1  proteins  in  order 
to  facilitate  the  design  of  drugs  which  may  interfere  with  the  spread  of  the  virus.  Important  therapeutic  targets 
under  study  include  HIV-1  reverse  transcriptase,  HIV-1  protease,  the  HIV-1  envelope  protein  gp120,  and  the 
CD4  receptor  protein  found  on  certain  host  cells.  Projects  include  modeling  of  leucine  zippers  in  GCN4  and 
HIV-1  reverse  transcriptase,  simulations  of  HIV-1  protease  monomer  in  solution,  analysis  of  inhibitor  binding  to 
the  active  site  of  HIV-1  protease,  and  investigation  of  the  mechanism  of  action  of  HIV-1  protease. 

Other  applied  research  on  molecules  of  biomedical  interest  uses  molecular  dynamics  simulations  to  predict 
function  or  structures  of  peptides  and  proteins.  Projects  include  modeling  the  metabolism-based 
transformation  of  myoglobin  to  an  oxidase  and  the  simulation  of  lattice  vibrations  in  the  L-alanine  crystal. 

Basic  research  is  underway  to  provide  a  better  understanding  of  biochemical  systems.  Projects  include  studies 
of  environmental  effects  on  protein  dynamics,  a  simulation  study  of  interleukin  1-beta,  comparison  with 
crystallographic  and  NMR  data,  harmonic  analysis  of  large  systems,  modeling  and  simulation  of  lipid  bilayers  (gel 
and  crystal  phases  in  particular),  structural  analysis  of  T4  lysozyme  mutants  in  the  harmonic  limit,  comparison  of 
simulations  on  staphylococcal  nuclease  with  NMR  data,  and  the  examination  of  long  range  deuterium  isotope 
effects in C-13 NMR spectra.
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Ongoing  efforts  in  this  area  include:  improving  methods  for  performing  free  energy  perturbation 
calculations by adding new features and options such as a potential of mean force to remove noise
associated  with  high  frequency  vibrational  motion,  improving  and  evaluating  methods  for  treating  solvent 
implicitly  to  provide  for  hydrophobic  effects  without  the  explicit  inclusion  of  water  molecules,  methods 
to  properly  treat  electronic  polarization  in  molecular  dynamics  simulations,  and  continuing  the 
development  and  evaluation  of  more  accurate  flexible  water  models.  Projects  include  the  analysis  of 
hysteresis in free energy perturbation simulations, slow growth homology modeling applied to model
systems  and  homologous  proteins,  development  of  quantum  mechanical  potentials  and  appropriate 
algorithms  for  use  in  molecular  dynamics  simulations,  studies  of  excited  state  and  electron  transfer 
processes  in  biological  systems,  semiempirical  Hartree-Fock  calculations  of  proteins,  new  methods  for 
long  range  truncation  of  the  potential  energy,  and  the  development  of  software  tools  to  automate  inhibitor 
design  and  evaluation  in  order  to  be  able  to  optimize  lead  compounds  in  rational  drug  design  efforts. 

Many  of  the  parameter  sets  and  models  that  are  generally  available  are  of  the  quality  required  for 
accurate  simulation  of  macromolecular  systems.  Therefore,  parameter  development  efforts  are  restricted 
to  areas  of  primary  interest  where  the  existing  parameter  sets  are  inadequate.  Ab  initio  chemistry,  crystal 
simulations,  vibrational  analysis,  solvated  molecular  dynamics  simulations,  and  free  energy  simulations 
are  being  used  in  this  effort.  One  such  example  is  the  development  of  parameters  for  simple  organic 
substituents to use in modeling lipids. Projects include development of van der Waals parameters for
methylene  and  methyl  groups,  development  and  use  of  a  polarizable  and  flexible  water  model,  molecular 
dynamics  simulation  studies  of  DNA  in  both  finite  and  repeating  (infinite)  systems,  analysis  of  the 
protein parameter sets using carboxy-myoglobin, and conversion of physical models into three-dimensional
coordinates. 
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Software  and  hardware  required  to  obtain  optimal  performance  for  simulation  research  are  being 
developed.  These  include  the  development  of  software  for  massively  parallel  Multiple  Instruction/Multiple 
Data  (MIMD)  machines,  microcoding  commercial  processors  to  obtain  optimal  performance,  and  the 
development  of  workstation  cluster  hardware  and  associated  software  for  optimal  performance  as  a  parallel 
computer. 

Massively  parallel  high  performance  computers  hold  great  promise  for  the  future  of  high  speed  scientific 
computing.  An  Intel  i860-based  machine  of  this  type  is  being  used  for  algorithm  development  and 
scientific  computing.  For  a  system  with  128  processors,  simulation  speedup  of  a  factor  of  75  (60% 
efficiency)  is  the  current  level  of  performance  for  a  macromolecular  system  with  14,000  atoms  and  a  13.5 
Angstrom  nonbonded  interaction  cutoff  distance.  Methods  for  this  class  of  machine  are  being  further 
designed  and  tested,  with  the  goal  of  using  this  system  for  future  research. 
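
The efficiency figure quoted above follows from dividing the observed speedup by the number of processors. The short sketch below only illustrates that arithmetic; the function name is hypothetical and is not part of any DCRT software.

```python
def parallel_efficiency(speedup, processors):
    """Return observed speedup divided by the ideal (linear) speedup."""
    if processors <= 0:
        raise ValueError("processors must be positive")
    return speedup / processors


# Figures quoted in the text: 128 processors, speedup of a factor of 75.
efficiency = parallel_efficiency(75, 128)
print(f"efficiency = {efficiency:.3f}")
```

For the figures in the text, 75/128 is about 0.59, which the report rounds to 60%.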

Workstation  clusters  provide  a  highly  competitive  environment  in  terms  of  cost  performance  for 
macromolecular  simulations.  A  workstation  cluster  based  on  the  Hewlett-Packard-730  machine  has  been 
assembled.  Parallel  software  is  being  developed  and  is  being  evaluated  as  a  function  of  network 
connectivity (Ethernet, Token Ring, or FDDI fiber ring). The software under development for the parallel
cluster  includes  both  CHARMM  for  macromolecular  simulation,  and  LIGHT,  an  NIH-developed  ray-trace 
raster  image  molecular  graphics  program. 
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The  EGAD  (Electronic  Grant  Application  Development)  project  is  reaching  a  stage  in  its 
implementation  where  it  is  important  to  have  some  understanding  of  how  information  will  flow. 
The  information  and  process  flow  diagrams  done  in  earlier  fiscal  years  have  been  taken  up  by 
specialized contractors who are charged with completing and refining the definition of the
information  flow  and  implementation  processes.  The  institutionalization  of  the  electronic  grant 
application  process  means  that  ideas  and  issues  which  were  only  conjecture  or  'research'  several 
years  ago  are  being  turned  into  standard  authoritative  approaches.  Contractors  are  good  at  working 
out  a  task  which  has  been  defined  for  them  but  they  normally  do  not  have  the  authority  to  explore 
all possibilities. Over the years we have found that we constantly tread the boundary between what
the  system  will  allow  us  to  think  or  do  and  the  areas  which  we  feel  we  must  explore  if  the  NIH  is 
to  be  a  viable  institution  in  the  future. 

We  have  been  looking  at  new  computer  network  tools  for  possible  application  in  the  grant 
review  process.  The  Gopher  network  tool  emerging  from  the  University  of  Minnesota  provides 
the  possibility  of  accessing  any  database  in  its  style  anywhere  on  the  InterNet.  We  have  been 
thinking about how Gopher could be used to access material not included in a grant application but
which nevertheless provides backup or support for an application. The WAIS (Wide Area
Information Servers) system emerging from Thinking Machines Inc. provides a network mechanism for
searching  through  many  loosely  associated  files  for  specific  items  of  information.  We  have  been 
experimenting  with  the  possibility  of  using  WAIS  to  scan  through  collections  of  applications  to 
find  new  patterns  of  research  ideas.  Instead  of  being  lodged  on  a  central  computer,  these  loosely 
associated  files  could  be  kept  on  the  workstations  of  Scientific  Research  Administrators  in  the 
DRG.  WAIS  also  provides  an  alerting  function  in  the  sense  that  once  a  question  is  asked  by  a 
grant  administrator,  it  can  be  automatically  re-asked  periodically  by  the  network  system.  Now  that 
universal  network  connectivity  has  been  achieved,  we  have  been  trying  to  understand  what  new 
methods  of  use  for  scientific  and  administrative  work  might  be  possible. 
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The  programming  and  simulation  of  a  lattice  model  of  protein  folding  begun  last  year  was 
completed  by  the  ICOT  collaborators.  The  solvent  environment  of  protein  was  represented  by 
short-range  interactions  around  the  amino  acid  sidechains.  Temperature  modulated  simulated 
annealing  was  used  in  an  attempt  to  drive  the  protein  folding.  The  short  range  character  of  the 
forces  employed  prevented  the  simulation  from  succeeding.  In  trying  to  understand  the  role  of  the 
solvent  in  folding,  we  began  to  think  about  various  longer  range  forms  of  organization  which  the 
water  could  take.  The  polarizability  of  the  amide  bond  led  to  considering  a  chain  of  fluctuating 
hydrogen bonds between the waters linking amide nitrogen to the carbonyl oxygen. Investigating
the structure of solvated unfolded protein sequences and desolvated folded helices of typical
proteins led to the realization that the topological number of the unfolded and folded states was the
same.  By  considering  the  number  of  free-electrons  in  each  of  the  20  amino  acids  we  were  able  to 
formulate  a  connectivity  based  model  of  protein  structure  in  which  all  the  transformations  involved 
in  folding  conserve  the  topological  number.  This  representation  combines  the  hydrophobic  and 
hydrophilic  character  of  each  amino  acid  in  consistent  fashion.  The  typical  transformation  event 
which  leads  to  protein  folding  is  analogous  to  what  is  known  in  physics  as  a  Feynman  diagram. 
The exchange of two hydrophilic water loops is conditioned by the proximity of a non-polarized
hydrophobic  water  loop.  The  Feynman  exchanges  occur  throughout  the  protein  sequence  after  the 
logical  equivalent  of  ribosomal  synthesis.  Each  type  of  amino  acid  along  the  protein  sequence 
adds  specificity  to  the  pattern  of  water  loop  exchanges. 
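
The lattice simulation described at the start of this summary (short-range interactions, with simulated annealing used to drive folding) can be illustrated by a minimal sketch. This is not the ICOT program. It assumes a 2D square lattice, a simple two-letter hydrophobic/polar ("H"/"P") sequence, an energy of -1 per non-bonded H-H contact, pivot moves, and a geometric cooling schedule. All names and parameters are hypothetical.

```python
import math
import random

# Unit steps on a 2D square lattice.
NEIGHBORS = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def contact_energy(coords, sequence):
    """Short-range energy: -1 for each pair of 'H' residues that are lattice
    neighbors but not adjacent along the chain."""
    position = {xy: i for i, xy in enumerate(coords)}
    energy = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        for dx, dy in NEIGHBORS:
            j = position.get((x + dx, y + dy))
            # j > i + 1 counts each non-bonded pair exactly once.
            if j is not None and j > i + 1 and sequence[j] == "H":
                energy -= 1
    return energy


def pivot_move(coords, rng):
    """Rotate the chain tail by +/-90 degrees about a random residue.
    Returns None if the rotated chain would overlap itself."""
    k = rng.randrange(1, len(coords) - 1)
    px, py = coords[k]
    sign = rng.choice((1, -1))
    new = list(coords[: k + 1])
    for x, y in coords[k + 1:]:
        dx, dy = x - px, y - py
        new.append((px - sign * dy, py + sign * dx))
    return new if len(set(new)) == len(new) else None


def anneal(sequence, steps=20000, t_start=2.0, t_end=0.1, seed=0):
    """Metropolis simulated annealing with geometric cooling (assumed schedule)."""
    rng = random.Random(seed)
    coords = [(i, 0) for i in range(len(sequence))]  # start fully extended
    energy = contact_energy(coords, sequence)
    best, best_energy = coords, energy
    for step in range(steps):
        temperature = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))
        trial = pivot_move(coords, rng)
        if trial is None:
            continue  # reject self-overlapping conformations
        trial_energy = contact_energy(trial, sequence)
        delta = trial_energy - energy
        # Accept downhill moves always, uphill moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            coords, energy = trial, trial_energy
            if energy < best_energy:
                best, best_energy = coords, energy
    return best, best_energy


best_coords, best_e = anneal("HPHPPHHPHH", steps=5000)
print(best_e)
```

Because the energy only sees nearest-neighbor contacts, a sketch like this has no way to represent the longer range solvent organization that the text goes on to consider.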

A  new  program  has  been  written  at  NIH  to  implement  this  topological  model  of  protein 
folding.  Secondary  structure  in  the  form  of  helices  and  beta  sheet  strands  has  been  achieved  and 
the packing of these objects is now under way. The computer capacity required for this folding
method is quite modest: an early version of the program forms the secondary structure for TIM
(Triose Phosphate Isomerase), a typical 248 amino acid protein, in about 6 hours on a personal
computer. Complete packing and three-dimensional optimization should take perhaps 2-3 times
more  computational  power. 
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We  are  trying  to  understand  the  interaction  between  hardware  design,  operating  system  style, 
and  programming  style  as  the  hardware  and  software  technologies  evolve.  The  Institute  for  New 
Generation  Computer  Technology  (ICOT)  installed  in  the  DCRT  one  of  the  PSI-III  logic 
programming computers in October 1991. The PSI-III computer runs a conventional Open
Software  Foundation  (OSF)  operating  environment.  We  investigated  the  characteristics  of  this 
machine  but  found  that  the  implementation  was  too  immature  to  be  of  any  scientific  use  for 
American  workers.  Using  the  InterNet  from  workstations  in  our  computational  environment,  we 
were  able  to  successfully  communicate  with  our  collaborators  at  ICOT.  During  the  previous  year 
the  InterNet  connection  between  the  USA  and  Japan  had  not  been  reliable  enough  for  day-by-day 
working  communications.  The  daily  transmission  of  faxes  had  served  our  communication  needs. 
In  switching  to  InterNet  communications  we  found  that  letters,  data  and  manuscripts  could  be  more 
rapidly  and  effectively  transmitted.  In  searching  for  tools  to  more  effectively  use  the  InterNet,  we 
found  Gopher  from  the  University  of  Minnesota  and  WAIS  from  Thinking  Machines  Inc.  We 
investigated  representing  and  retrieving  biological  information  using  these  tools.    Our  collaborators 
at  ICOT  were  able  to  move  the  program  for  simulating  protein  folding  from  one  processor  to  a 
256-processor  array.  As  they  demonstrated  at  the  Fifth  Generation  Computer  Systems  (FGCS) 
symposium in Tokyo in June 1992, the program runs 180 times faster. This is about 70% parallel
efficiency, which is very good. We plan to collaboratively apply this parallel computational
technology  to  a  number  of  other  biological  problems  in  the  next  years. 

This  project  is  terminated  at  the  end  of  FY92. 
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Multimodality  Research  Image  Processing  System 
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PI:      Margaret  A.  Douglas,  B.A.  Computer  Systems  Analyst  (DCRT/LAS) 
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This  project  is  to  develop  an  extensible  image  processing  system  (MRIPS)  to  study 
multidimensional (2D to 10D) data from multiple imaging modalities. The system is
based on a common hardware and software environment across NIH and is to be the
standard for macroscopic image processing at NIH. The primary use will be the
visualization and analysis of medical images. The system consists of data/computation
servers, workstations, off-the-shelf system software, and customized image
processing software suitable for both the development of new image visualization
and analysis tools and the use of existing image processing software packages.
This system will be used by trainees associated with the DRRP and other NIH
scientists for the analysis of medical images obtained by computerized tomography (CT),
magnetic resonance imaging (MRI), magnetic resonance spectroscopy (MRS), positron
emission tomography (PET), single photon emission tomography (SPECT), echo, etc.
Many of the studies will involve determination of the relationship between anatomic
and physiologic image data obtained from various tissue and organ systems. Of
particular importance in this regard is the need to accurately and efficiently obtain
spatial registration and segmentation of data collected with these different
modalities. Therefore, central to the design of MRIPS is the creation of an image
registry for short-term storage of images from all supported modalities to facilitate
selection of data from multiple modalities. The hardware environment of the MRIPS
is one of network-connected workstations and file servers. The servers will
provide access to data through importation from either tape- or network-based scanners
at NIH. The workstations and servers will use the Network File System or the
Andrew File System to provide a homogeneous file system. Most of the major 2D and 3D
medical imaging applications may be handled in MRIPS's software framework for the
near future. This framework can be tailored to the specific acquisition and
visualization medium to facilitate data exchange, storage, retrieval and multimodality
comparisons. In contrast to dissimilar systems now in use at NIH, this common
system will also promote shared development, testing and exchange of new algorithms.
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Previous collaborative efforts between the Nuclear Medicine Department and CSL have demonstrated the need for methods
to correct for head motion artifact during planar gamma camera studies of the brain. Since no commercial
position/orientation measurement system that met all the requirements could be identified, CSL is now adapting a
commercial system to perform the necessary corrections.

CSL has purchased an Intel Multibus II computer system that will allow energy, geometric, and motion corrections to be
performed in real time on data from a small field-of-view (FOV) gamma camera recently purchased by the Nuclear Medicine
Department of the Clinical Center. This small FOV camera uses a single position-sensitive photomultiplier tube
(PMT) instead of the multiple (non-position-sensitive) PMTs used in standard large FOV gamma cameras. This image
correction system will be interposed between the gamma camera and its data acquisition and processing computer,
correcting the data as they are transmitted from the camera to the computer.

System control software has been developed by CSL, as well as programs to display byte arrays and data acquired by direct
memory access (DMA) from the A/D converter modules. Software development will be complete by the end of FY93.
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Beginning in 1976, CSL developed and installed eleven Laboratory Data Acquisition and Control System (LDACS)
computer systems throughout Building 2. Based on Digital Equipment Corporation (DEC) LSI-11 microcomputers, the
LDACS computers were connected to laboratory instruments for control and data collection. Collected data were then
transferred to a central computer over low-speed serial lines for further processing.

CSL plans to replace the four LDACS computers still in routine use for which commercially equivalent systems are not
available with a mix of IBM PC and Apple Macintosh computers. The new systems will perform the same functions as
the LDACS but will be connected to the building Local Area Network (LAN).

The new computers will be equipped with the appropriate interface components to control the laboratory instruments. The
current range of desktop computers is considerably less expensive ($5-10K), offers more performance, and is potentially
less difficult to program than the original LDACS.

The software will be modular: small programs to perform minimal tasks (e.g., temperature measurement) invoked from a
general user interface program. The user interface will have a high degree of compatibility with the existing LDACS
system, and user screens will be easily modified. CSL plans to replace two LDACS in the current FY. The first unit, a
Perkin Elmer 580B infrared spectrophotometer, is now in use with the new PC interface. The second unit, a CARY 210
UV spectrophotometer, is under development. It is anticipated that this system will be general enough for use in other
research labs at NIH.
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PRINCIPAL  INVESTIGATOR  (List  other  professional  personnel  below  the  Principal  Investigator.)    (Name,  title,  laboratory,  and  institute  affiliation) 
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SUMMARY  OF  WORK    (Use  standard  unreduced  type.   Do  not  exceed  the  space  provided.) 

The  ability  to  measure  directly  the  forces  between  membranes  or  between  macromolecules  is  creating  a 
new  logic  for  thinking  about  molecular  recognition,  assembly,  and  folding.  The  outstanding  feature  of 
interaction  is  that  as  molecules  or  membranes  approach  contact,  the  important  work  of  approach 
involves  removal  of  organized  water  solvent  from  the  apposing  surfaces.  These  "hydration  forces"  are 
now  recognized  to  act  in  materials  as  diverse  as  lipid  bilayers,  proteins,  DNA  double  helices,  and  stiff 
polysaccharides. 

During  the  current  year  a  first  direct  measurement  of  forces  between  protein  molecules  (type  I  collagen 
triple  helices)  has  succeeded.  It  has  been  shown  that  the  force  has  all  the  features  characteristic  of 
hydration  forces.  The  temperature  dependence  of  the  force  is  similar  to  that  observed  in  ordered  arrays 
of DNA molecules. This shows that the physical nature of temperature-favored assembly in DNA and
proteins  might  be  similar.  Temperature-favored  assembly  is  a  common  feature  of  many  biologically 
important  processes. 

The theory of temperature-favored assembly induced by attractive hydration forces between hydrophilic
molecules  has  been  developed. 

Measurement  of  interaction  forces  between  dihexadecyldimethylammonium  acetate  bilayers  has 
demonstrated  that  neither  thermal-mechanical  undulations  nor  molecular  protrusions  contribute 
significantly to hydration forces between lipid bilayers.


DEPARTMENT OF HEALTH AND HUMAN SERVICES - PUBLIC HEALTH SERVICE
NOTICE OF INTRAMURAL RESEARCH PROJECT


PROJECT NUMBER
Z01 CT00242-01 PSL


PERIOD COVERED

October 1, 1991 to September 30, 1992


TITLE OF PROJECT (80 characters or less. Title must fit on one line between the borders.)
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Among  the  "allosteric  effectors"  regulating  Hemoglobin 
oxygen  affinity,  chloride  ions  have  long  been  known  to  have  a 
significant physiological role. From in vitro studies of the
relation between O₂ binding and Cl⁻ activity, it has been thought
that some 1.6 chlorides bind to stabilize the "deoxy" or "T" form
[1-4]. Last year we discovered that the oxygen affinity of
Hemoglobin correlates with water activity in the presence of
several neutral solutes. We inferred a change in protein
hydration in the deoxy T state to fully oxygenated R state
transition of some 60-65 waters. We were forced to ask, then,
whether salt could have an osmotic side-effect similar to that
seen with neutral solutes. We therefore re-examined the
regulatory action of chloride, explicitly including changes in
protein hydration and the dependence of water activity on added
salt. We now find that an alternate description of the data has
only a single allosteric chloride ion (Δn = -1.02 ± 0.02 Cl⁻)
directly linked with oxygenation during the deoxy-to-oxy
transition, while some 65.2 ± 2.4 additional water molecules bind
to the protein [5]. Within this analysis, the Cl⁻-regulated
loading of four oxygens can be described by the reaction

Hb·Cl + 4 O₂ + 65 H₂O  <==>  Hb·4O₂·65H₂O + Cl⁻.

Far  more  important  than  simply  a  matter  of  Hemoglobin  or 
chloride  alone,  we  must  now  face  the  possibility  of  a  general 
bias  in  gauging  the  action  of  "effectors"  of  protein  function,  a 
bias  from  neglect  of  solvation  and  the  new  hydration  forces  that 
seem to be ubiquitous among biomolecules.
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Work  began  on  this  project  during  FY92.  Its  goals  are: 

(1)  to  provide  services  to  members  of  the  NIH  research  community  exploiting  artificial  neural  network 
(ANN) computational paradigms. These applications range from noise elimination to medical decision
making to evaluation of screening of therapeutic agents. ANN methods are attractive alternatives in
some problems also approachable by traditional statistical methods, especially when the need for cheap,
rapid acquisition of a model outweighs the need for precise measurement of error or confidence. The
project will study and support, for the NIH community, the most useful generalized ANN software
platforms, and will consult with, and may collaboratively assist, biomedical scientists in using ANNs. A
bi-weekly network interest group in a journal club format began with a nucleus of interested NIH
colleagues. A 4-day neural network course, taught by an outside expert and open to the NIH
community, began an educational effort in September.

(2)  Studies  by  LSM  staff  included  surveys  of  existing  ANN  applications  in  two  fields:  medical 
diagnosis and DNA/protein sequence analysis. Preliminary formulation was done for algorithms for
training neural networks with unequal error weighting (e.g., when screening for a serious illness the
error of falsely diagnosing the illness might be judged preferable to the error of failing to detect it), and
for the learning of "hard" Boolean functions (a bid to raise the level of Kolmogorov complexity in data
which neural net models are able to represent).
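
The idea of unequal error weighting in item (2) can be illustrated with a cost-weighted loss function. This is only a sketch of one common approach (a weighted log loss), not the LSM formulation, which the text does not specify. The function name and the default costs are hypothetical.

```python
import math


def weighted_log_loss(y_true, y_prob, miss_cost=5.0, false_alarm_cost=1.0):
    """Mean log loss in which failing to detect a true case (label 1) is
    penalized miss_cost times, and a false alarm false_alarm_cost times."""
    eps = 1e-12  # keeps log() away from zero
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)
        if y == 1:
            total += -miss_cost * math.log(p)
        else:
            total += -false_alarm_cost * math.log(1.0 - p)
    return total / len(y_true)


# A confident miss is penalized more heavily than an equally confident false alarm.
print(weighted_log_loss([1], [0.1]), weighted_log_loss([0], [0.9]))
```

A network trained to minimize such a loss is pushed toward flagging borderline cases, which matches the screening example in the text, where a false diagnosis is judged preferable to a missed one.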
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The  Gibbs  Sampler  algorithm  is  based  on  a  small  but  powerful  set  of  results  in  probability  and 
mathematical  statistics.  These  results  guarantee  both  the  technical  rigor  and  the  broad  applicability  of  the 
method.  There  are,  however,  non-trivial  issues  concerned  with  convergence  and  implementation  and 
these  were  examined. 
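
For readers unfamiliar with the method, the following minimal sketch shows the core of a Gibbs sampler: each variable is drawn in turn from its conditional distribution given the current values of the others. It uses a textbook target (a standard bivariate normal with correlation rho), not the clinical model of this project. The function name and parameters are hypothetical.

```python
import math
import random


def gibbs_bivariate_normal(rho, n_samples, burn_in=500, seed=0):
    """Sample (x, y) from a standard bivariate normal with correlation rho
    by alternately drawing x | y and y | x."""
    rng = random.Random(seed)
    sd = math.sqrt(1.0 - rho * rho)  # conditional standard deviation
    x = y = 0.0
    samples = []
    for i in range(burn_in + n_samples):
        x = rng.gauss(rho * y, sd)  # x | y ~ N(rho * y, 1 - rho^2)
        y = rng.gauss(rho * x, sd)  # y | x ~ N(rho * x, 1 - rho^2)
        if i >= burn_in:  # discard early draws taken before convergence
            samples.append((x, y))
    return samples


draws = gibbs_bivariate_normal(rho=0.8, n_samples=20000)
mean_x = sum(x for x, _ in draws) / len(draws)
print(round(mean_x, 2))
```

The burn-in length here is an arbitrary choice. Deciding when the chain has converged is one of the non-trivial issues the text refers to.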

We  have  fully  implemented  the  Gibbs  Sampler  on  the  Intel  iPSC/860  (Hypercube)  in  DCRT. 
Speed-ups of more than two orders of magnitude have been obtained: in one problem, requiring more than
100 parameters, the algorithm took about 5 seconds to analyze on the Hypercube, as compared with
nearly  45  minutes  on  the  Convex  Supercomputer  in  DCRT.  Such  increases  in  computational  efficiency 
allow  the  biomedical  community  to  work  on  very  difficult  problems  in  a  real-time,  interactive  way. 

The  Sampler  thus  greatly  expands  on  the  conventional  understanding  of  "reasonable"  and  "tractable" 
biological  models,  and  allows  for  very  high-dimensional  (many  parameter)  data  analyses.  It  has  been 
used  by  us  for  real  clinical  studies:  see  Knebel  et  al.  (1992),  "Weaning  from  Mechanical  Ventilation  vs. 
Pressure  Support  Ventilation:  Comparison  of  Dyspnea,  Anxiety  and  Inspiratory  Effort,"  (submitted  to 
The American Review of Respiratory Disease).

In  this  study,  we  also  compared  alternative  classical,  still  technically  non-trivial  methods,  including 
the  Expectation-Maximization  method.  The  truly  classical  methods  for  this  ventilator  problem  require 
that every case having any missing points at all be entirely deleted from the analysis; the Gibbs
Sampler,  on  the  other  hand,  smoothly  allows  for  missing  data. 

Moreover,  the  results  from  the  several  methods  (classical  or  Gibbs)  are  not  always  identical,  telling  us 
that  they  each  "see"  the  data  in  a  different  way.  These  differences,  in  turn,  have  clinical  consequences 
and  suggest  new  questions  and  ideas  for  the  researcher  (e.g.  better  clinical  criteria  for  ventilatory 
weaning).  We  note  that  advanced  but  distinct  statistical  methods  can  often  result  in  such  differences, 
sometimes dramatically so, and thus lead the researcher to ask more refined, more focused questions,
as  well  as  possibly  resulting  in  a  complete  change  in  what  is  considered  current  "best"  practice. 

Some  of  our  practical  and  theoretical  results  appeared  in  a  paper  at  the  Annual  Meeting  of  the 
American  Statistical  Association. 
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The  Polymerase  Chain  Reaction  (PCR)  procedure  is  a  major  laboratory  technique  used  in  the  Human 
Genome  Project  for  amplification  of  sequence  fragments.  One  of  the  difficulties  encountered  when 
using  PCR  is  that  the  product  can  be  contaminated  with  the  DNA  of  non-genomic  sequences,  most 
notably  mitochondrial  DNA.  PCR  is  initiated  by  using  small  sequences,  usually  20  or  fewer  bases,  that 
act  as  primers  for  the  reaction. 

This  work  is  an  attempt  to  recognize  primers  to  avoid,  in  the  sense  that  they  will  cause  mitochondrial 
DNA to amplify and contaminate the genomic DNA product. Computer programs were written that, for
a given primer, first find all of its locations in human mitochondrial DNA, and then compute whether
PCR amplification can be expected, based on combinations of primer locations that are known to cause
amplification.
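
The two steps described above (locating primer sites, then checking whether the site combinations could support amplification) can be sketched as follows. The actual rules used by the programs are not given in the text. This sketch assumes exact matching, a linear rather than circular sequence, and a simple rule that a product is expected when a forward site is followed downstream by a reverse-primer site within a maximum product length. All names and the `max_product` default are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def find_sites(primer, genome):
    """Return all 0-based start positions of primer in genome (overlaps allowed)."""
    sites = []
    start = genome.find(primer)
    while start != -1:
        sites.append(start)
        start = genome.find(primer, start + 1)
    return sites


def predicted_amplicons(forward, reverse, genome, max_product=2000):
    """Return (start, end) intervals on the given strand where the forward
    primer matches and the reverse primer can anneal downstream, with a
    product no longer than max_product bases."""
    forward_sites = find_sites(forward, genome)
    # The reverse primer anneals where its reverse complement appears.
    reverse_sites = find_sites(reverse_complement(reverse), genome)
    products = []
    for f in forward_sites:
        for r in reverse_sites:
            end = r + len(reverse)
            if r >= f + len(forward) and end - f <= max_product:
                products.append((f, end))
    return products


toy_genome = "TTTACGTACGGGGGCCATGATTT"
print(predicted_amplicons("ACGTAC", "TCATGG", toy_genome))
```

A single primer can be tested against itself by passing it as both `forward` and `reverse`. A fuller treatment would also scan the opposite strand and allow for the circular mitochondrial genome.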

One  of  the  programs  being  written  analyzes  the  frequency  of  occurrence  of  all  possible  3-  to  10-mers  of 
the  four  bases  --  A,  C,  G,  T  --  in  given  sequences.  This  program  will  also  enable  comparison  of  all 
known  mitochondrial  sequences.  Data  are  being  analyzed  which  indicate  different  patterns  of  sequence 
usage  based  upon  the  species  of  origin  of  the  mitochondrial  DNA.  These  analyses  may  yield  valuable 
information  on  the  origin  of  the  mitochondria  of  the  different  organisms,  as  well  as  information  pertinent 
to  the  regulation  of  mitochondrial  DNA  function  and  structure. 
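
The frequency analysis of 3- to 10-mers described above amounts to sliding a window of each length along the sequence and tallying what is seen. The sketch below is an illustration, not the program being written. The function name is hypothetical, and skipping windows that contain characters other than A, C, G, T is an assumption.

```python
from collections import Counter

BASES = set("ACGT")


def kmer_counts(sequence, k_min=3, k_max=10):
    """Return {k: Counter of k-mers} for every k from k_min to k_max.
    Windows containing characters other than A, C, G, T are skipped."""
    sequence = sequence.upper()
    counts = {}
    for k in range(k_min, k_max + 1):
        tally = Counter()
        for i in range(len(sequence) - k + 1):
            kmer = sequence[i:i + k]
            if set(kmer) <= BASES:
                tally[kmer] += 1
        counts[k] = tally
    return counts


counts = kmer_counts("ACGACGTTACG", k_min=3, k_max=4)
print(counts[3].most_common(1))
```

Comparing the resulting tables between sequences, for example mitochondrial DNA from different species, is then a matter of comparing counts for each k-mer.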

Paper  in  preparation: 

"Mitochondrial  DNA  can  be  an  efficient  competitor  in  STS  PCR", 

Zullo, Kennedy, Gelernter, Polymeropoulos, Shapiro, Tallini, Merril, and Kidd
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